Copyright (c) 1993 - 2004 Compugen Ltd.



Search time 4744 Seconds 
(without alignments) 
10971.708 Million cell updates/sec 
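
The throughput figure can be cross-checked against the other numbers in this header (query length 1743, 14931090276 database residues, 4744 seconds). The sketch below assumes that one "cell update" is one query-residue by database-residue comparison, and that both strands of every database sequence were searched. The second assumption is inferred from the hit count (55026578 is exactly twice 27513289 sequences); the report does not state it. The function name is illustrative.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=2):
    """Dynamic-programming cells filled per second, in millions.

    strands=2 is an assumption: forward and reverse-complement searches.
    """
    cells = query_len * db_residues * strands
    return cells / seconds / 1e6


# Values copied from the report header.
rate = million_cell_updates_per_sec(1743, 14_931_090_276, 4744)
print(f"{rate:.3f} Million cell updates/sec")
```

Under these assumptions the computed rate agrees with the reported 10971.708 to three decimals.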



OM nucleic - nucleic search, using sw model 
Run on: March 22, 2004, 11:07:47 ; 

Title: US-10-069-541-5

Perfect score: 1743 

Sequence: 1 atggctttccatgtggaagg ctgaagataatttacagtga 1743 

Scoring table: IDENTITY_NUC 

Gapop 10.0 , Gapext 1.0 

Searched: 27513289 seqs, 14931090276 residues 

Total number of hits satisfying chosen parameters: 55026578 



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 



Post-processing : 



Minimum Match 0% 
Maximum Match 100% 
Listing first 45 summaries 
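
The run parameters above name a Smith-Waterman ("sw") model with gap open 10.0 and gap extension 1.0. The following is a minimal sketch of such a local alignment score with affine gaps (the Gotoh recurrences), not the search engine's actual implementation. The match and mismatch values are placeholders, because the contents of the IDENTITY_NUC table are not given in this report, and the gap convention (a gap of length k costs gap_open + (k - 1) * gap_extend) is likewise an assumption.

```python
def smith_waterman_affine(query, target, match=1.0, mismatch=-0.6,
                          gap_open=10.0, gap_extend=1.0):
    """Best local alignment score with affine gap penalties.

    match/mismatch are placeholder scores (assumption), not IDENTITY_NUC.
    A gap of length k is charged gap_open + (k - 1) * gap_extend (assumption).
    """
    neg_inf = float("-inf")
    n = len(target)
    h_prev = [0.0] * (n + 1)      # best score of an alignment ending at (i-1, j)
    f_prev = [neg_inf] * (n + 1)  # same, but ending in a gap that skips query residues
    best = 0.0
    for i in range(1, len(query) + 1):
        h_cur = [0.0] * (n + 1)
        f_cur = [neg_inf] * (n + 1)
        e = neg_inf               # ending in a gap that skips target residues
        for j in range(1, n + 1):
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            s = match if query[i - 1] == target[j - 1] else mismatch
            # Local alignment: a cell never drops below zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


print(smith_waterman_affine("ACGTACGT", "ACGTTACGT", gap_open=1.0, gap_extend=0.5))
```

With the report's penalties a gap only pays off when it rescues more than ten matches, which is consistent with the gap-free alignments listed below.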



Database :       EST:*
                 1:  em_estba:*
                 2:  em_esthum:*
                 3:  em_estin:*
                 4:  em_estmu:*
                 5:  em_estov:*
                 6:  em_estpl:*
                 7:  em_estro:*
                 8:  em_htc:*
                 9:  gb_est1:*
                 10:  gb_est2:*
                 11:  gb_htc:*
                 12:  gb_est3:*
                 13:  gb_est4:*
                 14:  gb_est5:*
                 15:  em_estfun:*
                 16:  em_estom:*
                 17:  em_gss_hum:*
                 18:  em_gss_inv:*
                 19:  em_gss_pln:*
                 20:  em_gss_vrt:*
                 21:  em_gss_fun:*
                 22:  em_gss_mam:*
                 23:  em_gss_mus:*
                 24:  em_gss_pro:*
                 25:  em_gss_rod:*
                 26:  em_gss_phg:*
                 27:  em_gss_vrl:*
                 28:  gb_gss1:*
                 29:  gb_gss2:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES

                 %
Result         Query
  No.   Score  Match Length  DB  ID        Description

    1    1473   84.5   1743  29  AY413298  AY413298 Homo sapi
    2    1463   83.9   1743  29  AY413299  AY413299 Pan trogl
    3    1375   78.9   4097  11  AK053063  AK053063 Mus muscu
    4    1375   78.9   4306  11  AK034415  AK034415 Mus muscu
    5  1156.2   66.3   1743  29  AY413300  AY413300 Mus muscu
    6   518.8   29.8    672  29  AG157499  AG157499 Pan trogl
    7   472.2   27.1    707  14  CD350164  CD350164 UI-M-FY0-
    8   462.8   26.6    669  13  BY727598  BY727598 BY727598
    9     404   23.2    516  10  BE233479  BE233479 139685 MA
   10   329.8   18.9    650  10  BB626260  BB626260 BB626260
   11   312.8   17.9    541  10  AW668962  AW668962 111664 MA
   12     290   16.6    675  13  BY729567  BY729567 BY729567
   13   274.2   15.7    524  10  BE723927  BE723927 198406 MA
   14   263.2   15.1    800   9  AL669749  AL669749 AL669749
   15   225.8   13.0    549  13  BW274870  BW274870 BW274870
   16   212.4   12.2   1037   9  AL666817  AL666817 AL666817
   17     210   12.0    941  14  CD360297  CD360297 AGENCOURT
   18   209.2   12.0    641  12  BI630566  BI630566 RH59836.5
   19   207.4   11.9    640  12  BI629504  BI629504 RH58381.5
   20   205.2   11.8    658  12  BM629925  BM629925 170006875
   21   203.2   11.7    652  10  BB626456  BB626456 BB626456
   22   193.6   11.1    583  13  BW277281  BW277281 BW277281
   23   192.6   11.0    605  13  BQ829470  BQ829470 LL6in2176
   24   186.4   10.7    624  12  BJ122485  BJ122485 BJ122485
   25     183   10.5    681  14  CD306544  CD306544 StrPu691.
   26   178.6   10.2    565  12  BJ125564  BJ125564 BJ125564
   27   177.4   10.2    576  14  CB391304  CB391304 OSTF149A8
   28   176.6   10.1    310   9  AL918603  AL918603 AL918603
   29     166    9.5    604   9  AU199794  AU199794 AU199794
   30   163.4    9.4    500   9  AV994375  AV994375 AV994375
   31   163.2    9.4    555  12  BJ117801  BJ117801 BJ117801
   32   158.4    9.1    646   9  AB078155  AB078155 AB078155
c  33   157.4    9.0    801  13  BW002036  BW002036 BW002036
c  34     155    8.9    561  28  AQ316435  AQ316435 RPCI11-10
   35     153    8.8    584  12  BJ105382  BJ105382 BJ105382
   36   145.2    8.3    500  12  BP186503  BP186503 BP186503
c  37     134    7.7    632  28  AZ612750  AZ612750 1M0439J17
   38   128.2    7.4    500  12  BJ105730  BJ105730 BJ105730
c  39   127.6    7.3    618  28  AZ908709  AZ908709 RPCI-24-2
   40     121    6.9    926  29  CNS04L3J  AL295624 Tetraodon
c  41   117.2    6.7    525  12  BI508286  BI508286 BB170004A
c  42   117.2    6.7    530  12  BI503332  BI503332 BB170012A
   43   112.8    6.5    355   9  AU209671  AU209671 AU209671
c  44   112.4    6.4    558  12  BI507950  BI507950 BB170010A
c  45     108    6.2    420  12  BI506529  BI506529 BB170027B


ALIGNMENTS 



RESULT 1
AY413298
LOCUS       AY413298                1743 bp    DNA     linear   GSS 12-DEC-2003
DEFINITION  Homo sapiens HCM4844 gene, VIRTUAL TRANSCRIPT, partial sequence,
            genomic survey sequence.
ACCESSION   AY413298
VERSION     AY413298.1  GI:39769260
KEYWORDS    GSS.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1743)
  AUTHORS   Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
            Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
            Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
            Adams,M.D. and Cargill,M.
  TITLE     Inferring nonneutral evolution from human-chimp-mouse orthologous
            gene trios
  JOURNAL   Science 302 (5652), 1960-1963 (2003)
   PUBMED   14671302
REFERENCE   2  (bases 1 to 1743)
  AUTHORS   Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
            Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
            Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
            Adams,M.D. and Cargill,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-NOV-2003) Celera Genomics, 45 West Gude Drive,
            Rockville, MD 20850, USA
COMMENT     This sequence was made by sequencing genomic exons and ordering
            them based on alignment.
FEATURES             Location/Qualifiers
     source          1..1743
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
     gene            <1..>1743
                     /locus_tag="HCM4844"
ORIGIN



Query Match 84.5%; Score 1473; DB 29; Length 1743;

Best Local Similarity 84.5%; Pred. No. 0;

Matches 1473; Conservative 0; Mismatches 270; Indels 0; Gaps 0;



Qy      1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

Qy     61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

Qy    121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGNN 180

Qy    181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

Db    181 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 240

Qy    241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300

Db    241 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 300

Qy    301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

Db    301 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 360

Qy    361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

Db    361 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 420

Qy    421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
                                      ||||||||||||||||||||||||||||||||
Db    421 NNNNNNNNNNNNNNNNNNNNNNNNNNNNGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

Qy    481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

Qy    541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600

Qy    601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660

Qy    661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

Qy    721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780

Qy    781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

Qy    841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900

Qy    901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

Qy    961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020

Qy   1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080

Qy   1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140

Qy   1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

Qy   1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260

Qy   1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320

Qy   1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380

Qy   1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

Qy   1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

Qy   1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560

Qy   1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

Qy   1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

Qy   1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

Qy   1741 TGA 1743
          |||
Db   1741 TGA 1743
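
The middle line of each alignment block marks the columns where the query (Qy) and database (Db) residues are identical. A minimal sketch of how such a line can be derived from the two sequence lines is shown here; the function name is illustrative, and treating N as an ordinary non-matching character is an assumption that agrees with the counts in this report (270 mismatches, all at N positions).

```python
def match_line(query_block, db_block, mark="|"):
    """Return the middle line of an alignment block: a mark where residues agree."""
    if len(query_block) != len(db_block):
        raise ValueError("blocks must have equal length")
    return "".join(mark if q == d else " " for q, d in zip(query_block, db_block))


# Block 121-180 of RESULT 1; the database sequence ends in two N bases.
qy = "GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT"
db = "GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGNN"
line = match_line(qy, db)
print(line.count("|"))
```

Summing the marks over all blocks of an alignment should reproduce the "Matches" value in its header.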



RESULT 2
AY413299
LOCUS       AY413299                1743 bp    DNA     linear   GSS 12-DEC-2003
DEFINITION  Pan troglodytes HCM4844 gene, VIRTUAL TRANSCRIPT, partial sequence,
            genomic survey sequence.
ACCESSION   AY413299
VERSION     AY413299.1  GI:39769261
KEYWORDS    GSS.
SOURCE      Pan troglodytes (chimpanzee)
  ORGANISM  Pan troglodytes
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Pan.
REFERENCE   1  (bases 1 to 1743)
  AUTHORS   Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
            Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
            Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
            Adams,M.D. and Cargill,M.
  TITLE     Inferring nonneutral evolution from human-chimp-mouse orthologous
            gene trios
  JOURNAL   Science 302 (5652), 1960-1963 (2003)
   PUBMED   14671302
REFERENCE   2  (bases 1 to 1743)
  AUTHORS   Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
            Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
            Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
            Adams,M.D. and Cargill,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-NOV-2003) Celera Genomics, 45 West Gude Drive,
            Rockville, MD 20850, USA
COMMENT     This sequence was made by sequencing genomic exons and ordering
            them based on alignment.
FEATURES             Location/Qualifiers
     source          1..1743
                     /organism="Pan troglodytes"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9598"
     gene            <1..>1743
                     /locus_tag="HCM4844"
ORIGIN



Query Match 83.9%; Score 1463; DB 29; Length 1743;

Best Local Similarity 84.1%; Pred. No. 0;

Matches 1466; Conservative 0; Mismatches 277; Indels 0; Gaps 0;
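
The percentages in an alignment header can be recomputed from its other fields. The sketch below parses the semicolon-separated fields and checks two relations that are assumptions rather than documented definitions: Query Match as Score divided by the perfect score (1743, from the run header), and Best Local Similarity as Matches divided by aligned columns. The helper name is illustrative.

```python
import re


def parse_stats(text):
    """Parse 'Name value;' fields of an alignment header into a dict of floats."""
    fields = {}
    for part in text.split(";"):
        m = re.match(r"\s*(.+?)\s+(-?\d+(?:\.\d+)?)%?\s*$", part)
        if m:
            fields[m.group(1)] = float(m.group(2))
    return fields


header = ("Query Match 83.9%; Score 1463; DB 29; Length 1743; "
          "Best Local Similarity 84.1%; Pred. No. 0; "
          "Matches 1466; Conservative 0; Mismatches 277; Indels 0; Gaps 0;")
stats = parse_stats(header)

perfect_score = 1743  # from the run header
query_match = round(100 * stats["Score"] / perfect_score, 1)
aligned = stats["Matches"] + stats["Mismatches"] + stats["Indels"]
local_similarity = round(100 * stats["Matches"] / aligned, 1)
print(query_match, local_similarity)
```

For this result both recomputed values agree with the header (83.9 and 84.1).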



Qy      1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

Qy     61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

Qy    121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACANNN 180

Qy    181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

Db    181 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 240

Qy    241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300

Db    241 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 300

Qy    301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

Db    301 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 360

Qy    361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

Db    361 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 420

Qy    421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
                                      ||||||||||||||||||||||||||||||||
Db    421 NNNNNNNNNNNNNNNNNNNNNNNNNNNNGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

Qy    481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

Qy    541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600

Qy    601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660

Qy    661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

Qy    721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780

Qy    781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

Qy    841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
          |||||||||||||||||||||||||||||||| |||||||||||||||||||||||||||
Db    841 TGCCTGGTGATGGCCATCCCAGCCATACTCATCGGGGCCATTGGAGCATCAACAGACTGG 900

Qy    901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

Qy    961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020

Qy   1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080

Qy   1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140

Qy   1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAG 1200

Qy   1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260

Qy   1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320

Qy   1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380

Qy   1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

Qy   1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
          ||||||||||||||||| ||||||||||||||||||||| |||||||||||||||| |||
Db   1441 ACACTTGCCATGGTTACGTCATTCTTAACCAACATTTGCGTCTCCTATCTAGCCAAATAT 1500

Qy   1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560

Qy   1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

Qy   1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

Qy   1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
          ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db   1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAANGGTCTGGGACTGAAGATAATTTACAG 1740

Qy   1741 TGA 1743
          |||
Db   1741 TGA 1743

RESULT 3
AK053063
LOCUS       AK053063                4097 bp    mRNA    linear   HTC 20-SEP-2003
DEFINITION  Mus musculus 15 days embryo head cDNA, RIKEN full-length enriched
            library, clone:D930038E20 product:solute carrier family 5 (choline
            transporter), member 7, full insert sequence.
VERSION     AK053063.1  GI:26343192
KEYWORDS    HTC; CAP trapper.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
  TITLE     Direct Submission
  JOURNAL   Submitted (16-JUL-2001) Yoshihide Hayashizaki, The Institute of
            Physical and Chemical Research (RIKEN), Laboratory for Genome
            Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
            RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
            Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
            URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
            Fax:81-45-503-9216)
COMMENT     cDNA library was prepared and sequenced in Mouse Genome
            Encyclopedia Project of Genome Exploration Research Group in Riken
            Genomic Sciences Center and Genome Science Laboratory in RIKEN.
            Division of Experimental Animal Research in Riken contributed to
            prepare mouse tissues.
            Please visit our web site for further details.
            URL:http://genome.gsc.riken.go.jp/
            URL:http://fantom.gsc.riken.go.jp/.
FEATURES             Location/Qualifiers
     source          1..4097
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="FANTOM_DB:D930038E20"
                     /db_xref="MGI:2424012"
                     /db_xref="taxon:10090"
                     /clone="D930038E20"
                     /tissue_type="head"
                     /clone_lib="RIKEN full-length enriched mouse cDNA library"
                     /dev_stage="15 days embryo"
     CDS             512..2254
                     /note="unnamed protein product; putative
                     solute carrier family 5 (choline transporter), member 7
                     (MGD|MGI:1927126, GB|NM_022025, evidence: BLASTN, 99%,
                     match=1743)"
                     /codon_start=1
                     /protein_id="BAC35253.1"
                     /db_xref="GI:26343193"
                     /translation="MSFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEERSEAIIV
                     GGRDIGLLVGGFTMTATWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFF
                     AKPMRSKGYVTMLDPFQQIYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVD
                     VNISVIVSALIAILYTLVGGLYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFT
                     AVHAKYQSPWLGTIESVEVYTWLDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFL
                     AAFGCLVMALPAICIGAIGASTDWNQTAYGYPDPKTKEEADMILPIVLQYLCPVYISF
                     FGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVWVMRITVLVFGASA
                     TAMALLTKTVYGLWYLSSDLVYIIIFPQLLCVLFIKGTNTYGAVAGYIFGLFLRITGG
                     EPYLYLQPLIFYPGYYSDKNGIYNQRFPFKTLSMVTSFFTNICVSYLAKYLFESGTLP
                     PKLDVFDAVVARHSEENMDKTILVRNENIKLNELAPVKPRQSLTLSSTFTNKEALLDV
                     DSSPEGSGTEDNLQ"
ORIGIN



Query Match 78.9%; Score 1375; DB 11; Length 4097; 

Best Local Similarity 86.8%; Pred. No. 0; 

Matches 1513; Conservative 0; Mismatches 230; Indels 0; Gaps 0; 



Qy      1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
          ||| ||||||| || ||||||||| ||||||| ||| | |||||||| || || || |||
Db    512 ATGTCTTTCCACGTAGAAGGACTGGTAGCTATTATCCTCTTCTACCTCCTTATATTTCTG 571

Qy     61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
          ||||||||||||||||| |||| |||||||||||| |||| | ||||||||||||| |||
Db    572 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGCAACCCAGAAGAGCGCAGTGAA 631

Qy    121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
          ||||||||||| || ||||| || |||||||| ||||||||||| ||||||||||||||
Db    632 GCCATCATAGTCGGGGGCCGTGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 691

Qy    181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
          |||||||| |||||||| || |||||||| ||||| |||||||| ||||  ||||||| |
Db    692 ACCTGGGTTGGAGGAGGCTACATCAATGGGACAGCAGAAGCAGTGTATGGGCCAGGTTGT 751

Qy    241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
          || |||||||||||||||||||| |||||||||||||| ||||| ||||||||||| |||
Db    752 GGTCTAGCTTGGGCTCAGGCACCCATTGGATATTCTCTGAGTCTAATTTTAGGTGGTCTG 811

Qy    301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
          || ||||| |||||||||||||| ||||| |||||||| ||||||||||| ||||| ||
Db    812 TTTTTTGCGAAACCTATGCGTTCCAAGGGATATGTGACTATGTTAGACCCATTTCAACAG 871

Qy    361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
          ||||||||||| |||||||| || || || || || ||||||||||||||||| ||||||
Db    872 ATCTATGGAAAGCGCATGGGTGGGCTGCTCTTCATCCCTGCACTGATGGGAGAGATGTTC 931

Qy    421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
          ||||||||||||||||||||||| || || |||||||||||||||||||| |||||||||
Db    932 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCCACCATCAGCGTGATCATTGATGTGGAT 991

Qy    481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
           || |||| || |||||  |||||||||||||||||| ||| || || || ||||| |||
Db    992 GTGAACATATCGGTCATTGTCTCTGCACTCATTGCCATTCTTTATACCCTAGTGGGTGGG 1051

Qy    541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
          ||||| |||||||| || |||||||| || ||||| || ||||||||| |||| ||||||
Db   1052 CTCTACTCTGTGGCATATACTGATGTTGTCCAGCTATTCTGCATTTTTATAGGACTGTGG 1111

Qy    601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
          ||||| ||||| |||||  ||||||||||||||||| | |||||||| ||||| ||||||
Db   1112 ATCAGTGTCCCTTTTGCCCTGTCACATCCTGCAGTCACCGACATCGGATTCACAGCTGTG 1171

Qy    661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
          ||||| |||||||| |  || |||||||||||  |||| |||  |||||||||| | |||
Db   1172 CATGCTAAATACCAGAGTCCCTGGCTGGGAACCATTGAATCAGTTGAAGTCTACACCTGG 1231

Qy    721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
          ||||||| ||||||||| ||||||||||||||||||||||||||||| ||||| ||||||
Db   1232 CTTGATAATTTTCTGTTATTGATGCTGGGTGGAATCCCATGGCAAGCCTACTTCCAGAGG 1291

Qy    781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
          || |||||||| |||||||||||||||||||| || |||||||||||||||||||| |||
Db   1292 GTCCTCTCTTCATCCTCAGCCACCTATGCTCAGGTACTGTCCTTCCTGGCAGCTTTTGGG 1351

Qy    841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
          ||||||||||||||  | || ||||||  ||| || || |||||||| || |||||||||
Db   1352 TGCCTGGTGATGGCTCTACCCGCCATATGCATAGGAGCTATTGGAGCTTCCACAGACTGG 1411

Qy    901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
          ||||||||||| || |||  |||||||||||||||||  || || |||||||||||| |
Db   1412 AACCAGACTGCCTACGGGTATCCAGATCCCAAGACTAAGGAGGAAGCAGACATGATTCTC 1471

Qy    961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
          || || ||||||||||| |||||||||||||| || || |||||||| |||||||| |||
Db   1472 CCGATCGTTCTGCAGTACCTCTGCCCTGTGTACATCTCCTTCTTTGGGCTTGGTGCTGTT 1531

Qy   1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
          || |||||||| ||||| ||||| || || |||||| |||| || ||||| ||||||||
Db   1532 TCAGCTGCTGTCATGTCCTCAGCTGACTCGTCCATCCTGTCGGCGAGTTCTATGTTTGCT 1591

Qy   1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
          ||||| ||||||||||||||||||||||||||||| || ||||| ||||| || |||||
Db   1592 CGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAAATTGTGTGGGTC 1651

Qy   1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
          ||| | ||||| ||| ||||||| ||||||||||||||||||||||| |||||||||||
Db   1652 ATGAGGATCACTGTGCTTGTGTTCGGAGCATCTGCAACAGCCATGGCTTTGCTGACGAAG 1711

Qy   1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
          ||||||||||||||||||||||| || ||||||||||| |||||| | |||||||| |||
Db   1712 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1771

Qy   1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
          ||||| |||||||||||  | || ||||||||||| |||||||| || || |||||| ||
Db   1772 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1831

Qy   1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
          | ||| || ||||||||||| |||||||| ||||||||||| ||  | |||||||| |||
Db   1832 TTTGGACTATTCCTGAGAATTACTGGAGGAGAGCCATATCTATACTTGCAGCCCTTAATC 1891

Qy   1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
          ||||||||||| |||||| ||||  | ||||||||||| |||||||  || |||||||||
Db   1892 TTCTACCCTGGTTATTACTCTGACAAGAATGGTATATACAATCAGAGGTTCCCATTTAAA 1951

Qy   1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
          || ||  |||||||||| |||||||| |||||||||||  | || |||||||||||||||
Db   1952 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCTTATCTAGCCAAGTAT 2011

Qy   1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
          ||||||||||||||||||||||| || |||||||||||||||||||||||||| |||||
Db   2012 CTATTTGAAAGTGGAACCTTGCCTCCAAAATTAGATGTATTTGATGCTGTTGTCGCAAGG 2071

Qy   1561
Db   2072
Qy   1621
Db   2132
Qy   1681
Db   2192
Qy   1741

CACAGT G AAG AAAAC AT G GAT AAGAC AAT T CT T GT C AAAAAT G AAAAT AT T AAAT T AGAT 1620 

II I II I I M I I I I I II I II I I I II I I I II I I I I II I I II I II I I I I I I I II II 

CACAGT GAAGAGAACAT GGACAAGAC CAT T CT AGT C AGAAAT GAAAAT AT CAAATT AAAT 2131 
GAACT T GCACTT GT GAAGC CAC GAC AGAGCAT GAC C CT C AGCT C AACTT T CAC C AAT AAA 1680 

I I I II I II M MIMI M II I II I I I I I I II I I I I I I II I I I II I M I I I II 

GAACT T GCACC T GT GAAAC CT C GGCAGAGC CT AAC C CT CAGT T CAACT T T CAC CAAT AAG 2191 
GAGGCCTT CCTT GAT GTT GATT CCAGTCCAGAAGGGTCT GGGACT G AAG AT AAT TT AC AG 1740 

II I I I I I II I I M II I I II I I I I II I II II I I I M I I I I I I II M I I I I I Mill 

GAGGC C CT C CT T GAT GTT GAT T CC AGT C C GGAGGG GT C T GG GACT GAAGAT AACT T ACAA 2251 



TGA 
I I I 



1743 



Db 2252 TGA 2254 
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VERSION 
KEYWORDS 
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MEDLINE 
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REFERENCE 
AUTHORS 

TITLE 

JOURNAL 
MEDLINE 
PUBMED 
REFERENCE 
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TITLE 

JOURNAL 
MEDLINE 
PUBMED 
REFERENCE 
AUTHORS 

TITLE 
JOURNAL 
REFERENCE 
AUTHORS 

TITLE 

JOURNAL 
REFERENCE 
AUTHORS 



AK034415 4306 bp mRNA linear HTC 18-SEP-2003 

Mus musculus adult male diencephalon cDNA, RIKEN full-length 
enriched library, clone : 9330188K24 product : solute carrier family 5 
(choline transporter), member 7, full insert sequence. 
AK034415 

AK034415.1 GI:26329926 
HTC; CAP trapper. 
Mus musculus (house mouse) 
Mus musculus 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 
Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Mus. 
1 

Carninci,P. and Hayashizaki , Y . 

High-efficiency full-length cDNA cloning 

Meth. Enzymol. 303, 19-44 (1999) 

99279253 

10349636 

2 

Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K., 

Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki , Y. 

Normalization and subtraction of cap-trapper-selected cDNAs to 

prepare full-length cDNA libraries for rapid discovery of new genes 

Genome Res. 10 (10), 1617-1630 (2000) 

20499374 

11042159 

3 

Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki, N., 
Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H. 
Sumi,N., Ishii,Y., Nakamura,S., Hazama,M. , Nishine,T., 
Yamamoto,R., Matsumoto, H . , Sakaguchi, S . , Ikegami,T., Kashiwagi, K. , 
Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M., 
Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J., 
Okazaki,Y., Muramatsu, M. , Inoue,Y., Kira,A. and Hayashizaki, Y. 
RIKEN integrated sequence analysis (RISA) system — 384-format 
sequencing pipeline with 384 multicapillary sequencer 
Genome Res. 10 (11), 1757-1771 (2000) 
20530913 
11076861 
4 

The RIKEN Genome Exploration Research Group Phase II Team and the 
FANTOM Consortium. 

Functional annotation of a full-length mouse cDNA collection 

Nature 409, 685-690 (2001) 

5 

The FANTOM Consortium and the RIKEN Genome Exploration Research 
Group Phase I & II Team. 

Analysis of the mouse transcriptome based on functional annotation 
of 60,770 full-length cDNAs 
Nature 420, 563-573 (2002) 
6 (bases 1 to 4306) 

Adachi,J., Aizawa,K., Akimura,T., Arakawa,T., Bono,H., Carninci,P., 
Fukuda,S., Furuno,M., Hanagaki,T., Hara,A. , Hashizume, W . , 



Carninci, P . , 

Itoh,M. , 
Harada, A. , 



Hayashida, K . , Hayatsu,N., Hiramoto, K. , Hiraoka,T., Hirozane,T. f 
Hori,F., Imotani, K. , Ishii,Y., Itoh,M., Kagawa,I., Kasukawa,T., 
Katoh,H., Kawai,J., Kojima,Y., Kondo,S., Konno,H., Kouda,M. , 
Koya,S., Kurihara,C, Matsuyama, T ., Miyazaki,A., Murata,M., 
Nakamura,M. , Nishi,K., Nomura, K., Numazaki, R. , Ohno,M., Ohsato,N., 
Okazaki,Y., Saito,R., Saitoh, H. f Sakai,C, Sakai,K., Sakazume, N . , 
Sano,H., Sasaki, D., Shibata,K., Shinagawa,A. , Shiraki,T., 
Sogabe,Y., Tagami,M., Tagawa,A. , Takahashi , F. , Takaku-Akahira, S . , 
Takeda,Y., Tanaka,T., Tomaru,A., Toya,T., Yasunishi, A. , 
Muramatsu,M. and Hayashizaki , Y . 
TITLE Direct Submission 

JOURNAL Submitted ( 16- JUL-2 001 ) Yoshihide Hayashizaki, The Institute of 
Physical and Chemical Research (RIKEN) , Laboratory for Genome 
Exploration Research Group, RIKEN Genomic Sciences Center (GSC) , 
RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, 
Kanagawa 230-0045, Japan (E-mail : genome-res@gsc . riken . go . jp, 
URL: http : //genome. gsc. riken. go. jp/, Tel : 81-45-503-9222, 
Fax: 81-45-503-9216) 
COMMENT cDNA library was prepared and sequenced in Mouse Genome 

Encyclopedia Project of Genome Exploration Research Group in Riken 
Genomic Sciences Center and Genome Science Laboratory in RIKEN. 
Division of Experimental Animal Research in Riken contributed to 
prepare mouse tissues. 

Please visit our web site for further details. 
URL: http : / / genome . gsc. riken. go . jp/ 
URL: http: //f antom. gsc. riken.go.jp/ . 
FEATURES Location/Qualifiers 
source 1. .4306 

/organism="Mus mus cuius" 

/mol_type= ,, mRNA" 

/strain="C57BL/6J" 

/ db_xref ="FANTOM_DB : 9330188K24" 

/db_xref="MGI: 2398619" 

/db_xref="taxon: 10090" 

/clone="9330188K24" 

/ sex="male" 

/tissue_type="diencephalon" 

/clone_lib="RIKEN full-length enriched mouse cDNA library" 
/dev_stage="adult" 
CDS 394. .2136 

/note="unnamed protein product; putative 
solute carrier family 5 (choline transporter), member 7 
(MGD | MGI: 1927126, GB | NM_022025, evidence: BLASTN, 99%, 
match=1743) " 
/codon_start=l 
/protein_id-"BAC28702 . 1" 
/db_xref="GI: 26329927" 

/ trans lation="MSFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEERSEAI IV 
GGRDIGLLVGGFTMTATWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFF 
AKPMRSKGYVTMLDPFQQI YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI IDVD 
VNISVIVSALIAILYTLVGGLYSVAYTDWQLFCIFIGLWISVPFALSHPAVTDIGFT 
AVHAKYQS PWLGTI ESVEVYTWLDN FLLLMLGGI PWQAYFQRVLS S S S AT YAQVLS FL 
AAFGCLVMALPAICIGAIGASTDWNQTAYGYPDPKTKEEADMILPIVLQYLCPVYISF 
FGLGAVS AAVMS S ADS S I L S AS SMFARN IYQLS FRQNAS DKE I VWVMRI T VLVFGASA 
TAMALLTKTVYGLWYLS S DLVYI 1 1 FPQLLCVLFI KGTNT YGAVAGYI FGLFLRITGG 
EPYLYLQPLIFYPGYYSDKNGIYNQRFPFKTLSMVTSFFTNICVSYLAKYLFESGTLP 
P KL D VFD AWARH S E ENMD KT I LVRN EN I K LN E LAP VK P RQ S LT L S S T FT N KEAL L D V 



ORIGIN 



DSSPEGSGTEDNLQ 



Query Match 78.9%; Score 1375; DB 11; Length 4306; 

Best Local Similarity 86.8%; Pred. No. 0; 

Matches 1513; Conservative 0; Mismatches 230; Indels 0; Gaps 0; 
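
As a cross-check, the two percentages in the summary lines above can be recomputed from the reported counts. The Python sketch below rests on assumptions that the report itself does not state: that Query Match is the score divided by the perfect score (1743 in the run header), and that Best Local Similarity is the number of matches divided by the total number of aligned columns. The function names are illustrative only and are not part of the search software.

```python
# Hypothetical helpers; the formulas are assumptions inferred from the
# figures printed in this report, not a documented definition.

def query_match_pct(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's perfect (self-match) score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, mismatches: int,
                              indels: int, conservative: int = 0) -> float:
    """Identical columns as a percentage of all aligned columns."""
    aligned_columns = matches + conservative + mismatches + indels
    return 100.0 * matches / aligned_columns


# Counts taken from the summary lines of this result (AK034415).
print(round(query_match_pct(1375, 1743), 1))               # 78.9
print(round(best_local_similarity_pct(1513, 230, 0), 1))   # 86.8
```

The same two assumptions also reproduce the figures of a gapped hit in this report: 518.8 / 1743 gives 29.8%, and 549 / (549 + 12 + 3) gives 97.3%.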

Qy     1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
         ||| ||||||| || ||||||||| ||||||| ||| | |||||||| || || || |||
Db   394 ATGTCTTTCCACGTAGAAGGACTGGTAGCTATTATCCTCTTCTACCTCCTTATATTTCTG 453

Qy    61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
         ||||||||||||||||| |||| |||||||||||| |||| | ||||||||||||| |||
Db   454 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGCAACCCAGAAGAGCGCAGTGAA 513

Qy   121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
         ||||||||||| || ||||| || |||||||| ||||||||||| ||||||||||||||
Db   514 GCCATCATAGTCGGGGGCCGTGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 573

Qy   181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
         |||||||| |||||||| || |||||||| ||||| |||||||| ||||  ||||||| |
Db   574 ACCTGGGTTGGAGGAGGCTACATCAATGGGACAGCAGAAGCAGTGTATGGGCCAGGTTGT 633

Qy   241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
         || |||||||||||||||||||| |||||||||||||| ||||| ||||||||||| |||
Db   634 GGTCTAGCTTGGGCTCAGGCACCCATTGGATATTCTCTGAGTCTAATTTTAGGTGGTCTG 693

Qy   301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
         || ||||| |||||||||||||| ||||| |||||||| ||||||||||| ||||| ||
Db   694 TTTTTTGCGAAACCTATGCGTTCCAAGGGATATGTGACTATGTTAGACCCATTTCAACAG 753

Qy   361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
         ||||||||||| |||||||| || || || || || ||||||||||||||||| ||||||
Db   754 ATCTATGGAAAGCGCATGGGTGGGCTGCTCTTCATCCCTGCACTGATGGGAGAGATGTTC 813

Qy   421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
         ||||||||||||||||||||||| || || |||||||||||||||||||| |||||||||
Db   814 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCCACCATCAGCGTGATCATTGATGTGGAT 873

Qy   481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
          || |||| || |||||  |||||||||||||||||| ||| || || || ||||| |||
Db   874 GTGAACATATCGGTCATTGTCTCTGCACTCATTGCCATTCTTTATACCCTAGTGGGTGGG 933

Qy   541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
         ||||| |||||||| || |||||||| || ||||| || ||||||||| |||| ||||||
Db   934 CTCTACTCTGTGGCATATACTGATGTTGTCCAGCTATTCTGCATTTTTATAGGACTGTGG 993

Qy   601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
         ||||| ||||| |||||  ||||||||||||||||| | |||||||| ||||| ||||||
Db   994 ATCAGTGTCCCTTTTGCCCTGTCACATCCTGCAGTCACCGACATCGGATTCACAGCTGTG 1053

Qy   661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
         ||||| |||||||| |  || |||||||||||  |||| |||  |||||||||| | |||
Db  1054 CATGCTAAATACCAGAGTCCCTGGCTGGGAACCATTGAATCAGTTGAAGTCTACACCTGG 1113

Qy   721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
         ||||||| ||||||||| ||||||||||||||||||||||||||||| ||||| ||||||
Db  1114 CTTGATAATTTTCTGTTATTGATGCTGGGTGGAATCCCATGGCAAGCCTACTTCCAGAGG 1173

Qy   781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
         || |||||||| |||||||||||||||||||| || |||||||||||||||||||| |||
Db  1174 GTCCTCTCTTCATCCTCAGCCACCTATGCTCAGGTACTGTCCTTCCTGGCAGCTTTTGGG 1233

Qy   841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
         ||||||||||||||  | || ||||||  ||| || || |||||||| || |||||||||
Db  1234 TGCCTGGTGATGGCTCTACCCGCCATATGCATAGGAGCTATTGGAGCTTCCACAGACTGG 1293

Qy   901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
         ||||||||||| || |||  |||||||||||||||||  || || |||||||||||| |
Db  1294 AACCAGACTGCCTACGGGTATCCAGATCCCAAGACTAAGGAGGAAGCAGACATGATTCTC 1353

Qy   961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
         || || ||||||||||| |||||||||||||| || || |||||||| |||||||| |||
Db  1354 CCGATCGTTCTGCAGTACCTCTGCCCTGTGTACATCTCCTTCTTTGGGCTTGGTGCTGTT 1413

Qy  1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
         || |||||||| ||||| ||||| || || |||||| |||| || ||||| ||||||||
Db  1414 TCAGCTGCTGTCATGTCCTCAGCTGACTCGTCCATCCTGTCGGCGAGTTCTATGTTTGCT 1473

Qy  1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
         ||||| ||||||||||||||||||||||||||||| || ||||| ||||| || |||||
Db  1474 CGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAAATTGTGTGGGTC 1533

Qy  1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
         ||| | ||||| ||| ||||||| ||||||||||||||||||||||| |||||||||||
Db  1534 ATGAGGATCACTGTGCTTGTGTTCGGAGCATCTGCAACAGCCATGGCTTTGCTGACGAAG 1593

Qy  1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
         ||||||||||||||||||||||| || ||||||||||| |||||| | |||||||| |||
Db  1594 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1653

Qy  1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
         ||||| |||||||||||  | || ||||||||||| |||||||| || || |||||| ||
Db  1654 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1713

Qy  1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
         | ||| || ||||||||||| |||||||| ||||||||||| ||  | |||||||| |||
Db  1714 TTTGGACTATTCCTGAGAATTACTGGAGGAGAGCCATATCTATACTTGCAGCCCTTAATC 1773

Qy  1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
         ||||||||||| |||||| ||||  | ||||||||||| |||||||  || |||||||||
Db  1774 TTCTACCCTGGTTATTACTCTGACAAGAATGGTATATACAATCAGAGGTTCCCATTTAAA 1833

Qy  1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
         || ||  |||||||||| |||||||| |||||||||||  | || |||||||||||||||
Db  1834 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCTTATCTAGCCAAGTAT 1893

Qy  1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
         ||||||||||||||||||||||| || |||||||||||||||||||||||||| |||||
Db  1894 CTATTTGAAAGTGGAACCTTGCCTCCAAAATTAGATGTATTTGATGCTGTTGTCGCAAGG 1953

Qy  1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
         ||||||||||| |||||||| ||||| ||||| |||| |||||||||||| |||||| ||
Db  1954 CACAGTGAAGAGAACATGGACAAGACCATTCTAGTCAGAAATGAAAATATCAAATTAAAT 2013

Qy  1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
         |||||||||| |||||| || || |||||| | |||||||| |||||||||||||||||
Db  2014 GAACTTGCACCTGTGAAACCTCGGCAGAGCCTAACCCTCAGTTCAACTTTCACCAATAAG 2073

Qy  1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
         |||||| |||||||||||||||||||||| || |||||||||||||||||||| |||||
Db  2074 GAGGCCCTCCTTGATGTTGATTCCAGTCCGGAGGGGTCTGGGACTGAAGATAACTTACAA 2133

Qy  1741 TGA 1743
         |||
Db  2134 TGA 2136


RESULT 5
AY413300
LOCUS       AY413300                1743 bp    DNA     linear   GSS 12-DEC-2003
DEFINITION  Mus musculus HCM4844 gene, VIRTUAL TRANSCRIPT, partial sequence,
            genomic survey sequence.
ACCESSION   AY413300
VERSION     AY413300.1  GI:39769262
KEYWORDS    GSS.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 1743)
  AUTHORS   Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
            Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
            Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
            Adams,M.D. and Cargill,M.
  TITLE     Inferring nonneutral evolution from human-chimp-mouse orthologous
            gene trios
  JOURNAL   Science 302 (5652), 1960-1963 (2003)
   PUBMED   14671302
REFERENCE   2  (bases 1 to 1743)
  AUTHORS   Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
            Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
            Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
            Adams,M.D. and Cargill,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-NOV-2003) Celera Genomics, 45 West Gude Drive,
            Rockville, MD 20850, USA
COMMENT     This sequence was made by sequencing genomic exons and ordering
            them based on alignment.
FEATURES             Location/Qualifiers
     source          1..1743
                     /organism="Mus musculus"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:10090"
     gene            <1..>1743
                     /locus_tag="HCM4844"
ORIGIN



Query Match 66.3%; Score 1156.2; DB 29; Length 1743;

Best Local Similarity 73.1%; Pred. No. 3.8e-304;

Matches 1275; Conservative 0; Mismatches 468; Indels 0; Gaps 0;



Qy     1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
         ||| ||||||| || ||||||||| ||||||| ||| | |||||||| || || || |||
Db     1 ATGTCTTTCCACGTAGAAGGACTGGTAGCTATTATCCTCTTCTACCTCCTTATATTTCTG 60

Qy    61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
         ||||||||||||||||| |||| |||||||||||| |||| | ||||||||| ||| |||
Db    61 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGCAACCCAGAAGAGCACAGTGAA 120

Qy   121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
         ||||||||||| || ||||| || |||||||| ||||||||||| |||||||||||||
Db   121 GCCATCATAGTCGGGGGCCGTGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGNN 180

Qy   181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

Db   181 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 240

Qy   241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300

Db   241 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 300

Qy   301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

Db   301 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 360

Qy   361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

Db   361 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 420

Qy   421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
                                     | |||||||||||||||||||| |||||||||
Db   421 NNNNNNNNNNNNNNNNNNNNNNNNNNNNGGGCCACCATCAGCGTGATCATTGATGTGGAT 480

Qy   481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
          || |||| || |||||  |||||||||||||||||| ||| || || || ||||| |||
Db   481 GTGAACATATCGGTCATTGTCTCTGCACTCATTGCCATTCTTTATACCCTAGTGGGTGGG 540

Qy   541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
         ||||| |||||||| || |||||||| || ||||| || ||||||||| |||| ||||||
Db   541 CTCTACTCTGTGGCATATACTGATGTTGTCCAGCTATTCTGCATTTTTATAGGACTGTGG 600

Qy   601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
         ||||| ||||| |||||  ||||||||||||||||| | |||||||| ||||| ||||||
Db   601 ATCAGTGTCCCTTTTGCCCTGTCACATCCTGCAGTCACCGACATCGGATTCACAGCTGTG 660

Qy   661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
         ||||| |||||||| |  || |||||||||||  |||| |||  |||||||||| | |||
Db   661 CATGCTAAATACCAGAGTCCCTGGCTGGGAACCATTGAATCAGTTGAAGTCTACACCTGG 720

Qy   721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
         ||||||| ||||||||| ||||||||||||||||||||||||||||| ||||| ||||||
Db   721 CTTGATAATTTTCTGTTATTGATGCTGGGTGGAATCCCATGGCAAGCCTACTTCCAGAGG 780

Qy   781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
         || |||||||| |||||||||||||||||||| || |||||||||||||||||||| |||
Db   781 GTCCTCTCTTCATCCTCAGCCACCTATGCTCAGGTACTGTCCTTCCTGGCAGCTTTTGGG 840

Qy   841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
         ||||||||||||||  | || ||||||  ||| || || |||||||| || |||||||||
Db   841 TGCCTGGTGATGGCTCTACCCGCCATATGCATAGGAGCTATTGGAGCTTCCACAGACTGG 900

Qy   901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
         ||||||||||| || |||  |||||||||||||||||  || || |||||||||||| |
Db   901 AACCAGACTGCCTACGGGTATCCAGATCCCAAGACTAAGGAGGAAGCAGACATGATTCTC 960

Qy   961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
         || || ||||||||||| |||||||||||||| || || |||||||| |||||||| |||
Db   961 CCGATCGTTCTGCAGTACCTCTGCCCTGTGTACATCTCCTTCTTTGGGCTTGGTGCTGTT 1020

Qy  1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
         || |||||||| ||||| ||||| || || |||||| |||| || ||||| ||||||||
Db  1021 TCAGCTGCTGTCATGTCCTCAGCTGACTCGTCCATCCTGTCGGCGAGTTCTATGTTTGCT 1080

Qy  1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
         ||||| ||||||||||||||||||||||||||||| || ||||| ||||| || |||||
Db  1081 CGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAAATTGTGTGGGTC 1140

Qy  1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
         ||| | ||||| ||| ||||||| ||||||||||||||||||||||| |||||||||||
Db  1141 ATGAGGATCACTGTGCTTGTGTTCGGAGCATCTGCAACAGCCATGGCTTTGCTGACGAAG 1200

Qy  1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
         ||||||||||||||||||||||| || ||||||||||| |||||| | |||||||| |||
Db  1201 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1260

Qy  1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
         ||||| |||||||||||  | || ||||||||||| |||||||| || || |||||| ||
Db  1261 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1320

Qy  1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
         | ||| || ||||||||||| |||||||| ||||||||||| ||  | |||||||| |||
Db  1321 TTTGGACTATTCCTGAGAATTACTGGAGGAGAGCCATATCTATACTTGCAGCCCTTAATC 1380

Qy  1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
         ||||||||||| |||||| ||||  | ||||||||||| |||||||  || |||||||||
Db  1381 TTCTACCCTGGTTATTACTCTGACAAGAATGGTATATACAATCAGAGGTTCCCATTTAAA 1440

Qy  1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
         || ||  |||||||||| |||||||| |||||||||||  | || |||||||||||||||
Db  1441 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCTTATCTAGCCAAGTAT 1500

Qy  1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
         ||||||||||||||||||||||| || |||||||||||||||||||||||||| |||||
Db  1501 CTATTTGAAAGTGGAACCTTGCCTCCAAAATTAGATGTATTTGATGCTGTTGTCGCAAGG 1560

Qy  1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
         ||||||||||| |||||||| ||||| ||||| |||| |||||||||||| |||||| ||
Db  1561 CACAGTGAAGAGAACATGGACAAGACCATTCTAGTCAGAAATGAAAATATCAAATTAAAT 1620

Qy  1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
         |||||||||| |||||| || || |||||| | |||||||| |||||||||||||||||
Db  1621 GAACTTGCACCTGTGAAACCTCGGCAGAGCCTAACCCTCAGTTCAACTTTCACCAATAAG 1680

Qy  1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
         |||||| |||||||||||||||||||||| || |||||||||||||||||||| |||||
Db  1681 GAGGCCCTCCTTGATGTTGATTCCAGTCCGGAGGGGTCTGGGACTGAAGATAACTTACAA 1740

Qy  1741 TGA 1743
         |||
Db  1741 TGA 1743



RESULT 6
AG157499
LOCUS       AG157499                 672 bp    DNA     linear   GSS 09-JAN-2002
DEFINITION  Pan troglodytes DNA, clone: RP43-022H02.T7, genomic survey
            sequence.
ACCESSION   AG157499
VERSION     AG157499.1  GI:16687177
KEYWORDS    GSS.
SOURCE      Pan troglodytes (chimpanzee)
  ORGANISM  Pan troglodytes
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Pan.
REFERENCE   1
  AUTHORS   Fujiyama,A., Hattori,M., Toyoda,A., Taylor,T.D., Yada,T.,
            Totoki,Y., Watanabe,H. and Sakaki,Y.
  TITLE     BAC end sequences of Library RPCI-43
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 672)
  AUTHORS   Fujiyama,A., Hattori,M., Toyoda,A., Taylor,T.D., Yada,T.,
            Totoki,Y., Watanabe,H. and Sakaki,Y.
  TITLE     Direct Submission
  JOURNAL   Submitted (02-AUG-2001) Asao Fujiyama, The Institute of Physical
            and Chemical Research (RIKEN), Genomic Sciences Center (GSC);
            1-7-22 Suehiro-chou, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
            (E-mail:chimpbes@gsc.riken.go.jp, URL:http://hgp.gsc.riken.go.jp/,
            Tel:81-45-503-9111, Fax:81-45-503-9170)
COMMENT     Clones are derived from the chimpanzee BAC library RPCI-43 This BAC
            end was generated during the R&D process and may have higher chance
            of clone tracking errors.
            PRIMERS
            Sequencing: T7
            LIBRARY
            Vector : pBACe3.6
            R.Site 1 : EcoRI
            R.Site 2 : EcoRI.
FEATURES             Location/Qualifiers
     source          1..672
                     /organism="Pan troglodytes"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9598"
                     /clone="RP43-022H02.T7"
                     /sex="male"
                     /cell_type="lymphocytes"
                     /clone_lib="RPCI-43 Chimpanzee Male BAC Library"
ORIGIN



Query Match 29.8%; Score 518.8; DB 29; Length 672; 

Best Local Similarity 97.3%; Pred. No. 3.5e-130; 

Matches 549; Conservative 0; Mismatches 12; Indels 3; Gaps 2;



Qy  1110 AAATGCTTCGGACAAAGAAATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGC 1169
         | | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   111 ACAGGCTTCGGACAAAGAAATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGC 170

Qy  1170 ATCTGCAACAGCCATGGCCTTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTC 1229
         |||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||
Db   171 ATCTGCAACAGCCATGGCCTTGCTGACGAAGACTGTGTATGGGCTCTGGTACCTCAGTTC 230

Qy  1230 TGACCTTGTTTACATCGTTATCTTCCCCCAGCTGCTTTGTGTACTCTTTGTTAAGGGAAC 1289
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   231 TGACCTTGTTTACATCGTTATCTTCCCCCAGCTGCTTTGTGTACTCTTTGTTAAGGGAAC 290

Qy  1290 CAACACCTATGGGGCCGTGGCAGGTTATGTTTCTGGCCTCTTCCTGAGAATAACTGGAGG 1349
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   291 CAACACCTATGGGGCCGTGGCAGGTTATGTTTCTGGCCTCTTCCTGAGAATAACTGGAGG 350

Qy  1350 GGAGCCATATCTGTATCTTCAGCCCTTGATCTTCTACCCTGGCTATTACCCTGATGATAA 1409
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   351 GGAGCCATATCTGTATCTTCAGCCCTTGATCTTCTACCCTGGCTATTACCCTGATGATAA 410

Qy  1410 TGGTATATATAATCAGAAATTTCCATTTAAAACACTTGCCATGGTTACATCATTCTTAAC 1469
         |||||||||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db   411 TGGTATATATAATCAGAAATTTCCATTTAAAACACTTGCCATGGTTACGTCATTCTTAAC 470

Qy  1470 CAACATTTGCATCTCCTATCTAGCCAAGTATCTATTTG-AAAGTGGAACCTTGCCACCTA 1528
         |||||||||| |||||||||||||||| |||||||||| |||||||||||||||||||||
Db   471 CAACATTTGCGTCTCCTATCTAGCCAAATATCTATTTGAAAAGTGGAACCTTGCCACCTA 530

Qy  1529 AATTAGATGTATTTGATGCTGTTGTTGCAAGACACAGTGAAGAAAACATGGATAAGACAA 1588
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   531 AATTAGATGTATTTGATGCTGTTGTTGCAAGACACAGTGAAGAAAACATGGATAAGACAA 590

Qy  1589 TTCTTGTCAAAAATGAAAATATTAAATTAGATGAACTTGCACTTGTGAAGCCACGACAGA 1648
         || |||||||||||||||    |||||||||||| |||||||||||||||||||||||||
Db   591 TTTTTGTCAAAAATGAAA--TATAAATTAGATGACCTTGCACTTGTGAAGCCACGACAGA 648

Qy  1649 GCATGACCCTCAGCTCAACTTTCA 1672
          |||||||||||||| ||||||||
Db   649 ACATGACCCTCAGCTTAACTTTCA 672



RESULT 7
CD350164
LOCUS       CD350164                 707 bp    mRNA    linear   EST 09-JUL-2003
DEFINITION  UI-M-FY0-cfl-h-10-0-UI.r1 NIH_BMAP_FY0 Mus musculus cDNA clone
            IMAGE:6851099 5', mRNA sequence.
ACCESSION   CD350164
VERSION     CD350164.1  GI:31141679
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 707)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. Jim Lin, University of Iowa
            cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
            cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
            DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
            Clone Distribution: Distribution information can be found at
            http://genome.uiowa.edu/distribution/mousef1.html
            This clone was contributed by the Brain Molecular Anatomy Project
            (BMAP)
            Seq primer: pYX-5.
FEATURES             Location/Qualifiers
     source          1..707
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:6851099"
                     /tissue_type="whole brain"
                     /dev_stage="embryo 13.5, 14.5, 16.5, 17.5dpc"
                     /lab_host="DH10B (T1 phage resistant)"
                     /clone_lib="NIH_BMAP_FY0"
                     /note="Organ: Brain; Vector: pYX-Asc; Site_1: EcoR I;
                     Site_2: Not I; The library was constructed according to
                     Bonaldo, Lennon and Soares, Genome Research, 6:791-806,
                     1996. Denatured RNA was size fractionated on a 1% agarose
                     gel. First strand cDNA synthesis was primed with oligo-dT
                     primer containing a Not I site. Double strand cDNA was
                     size selected according to mRNA size fraction, ligated
                     with EcoR I adaptor, digested with NotI and then cloned
                     directionally into pYX-Asc vector. The library tag
                     sequence located between the Not I site and the polyA tail
                     is AGCGAGACAG. This library was created for the University
                     Iowa Brain Anatomy Project (BMAP): 'Gene Discovery in the
                     Developing Mouse Nervous System', supported by National
                     Institute of Mental Health (NIMH), Hemin Chin, Ph.D.,
                     program coordinator."
ORIGIN

Query Match 27.1%; Score 472.2; DB 14; Length 707; 

Best Local Similarity 86.2%; Pred. No. 1.9e-117; 

Matches 580; Conservative 0; Mismatches 88; Indels 5; Gaps 5; 



Qy   1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
       ||| ||||||| || ||||||||| ||||||| ||| | |||||||| || || || |||
Db  27 ATGTCTTTCCACGTAGAAGGACTGGTAGCTATTATCCTCTTCTACCTCCTTATATTTCTG 86

Qy  61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
       ||||||||||||||||| |||| |||||||||||| |||| | ||||||||||||| |||
Db  87 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGCAACCCAGAAGAGCGCAGTGAA 146

Qy 121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
       ||||||||||| || ||||| || |||||||| ||||||||||| ||||||||||||||
Db 147 GCCATCATAGTCGGGGGCCGTGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 206

Qy 181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
       |||||||| |||||||| || |||||||| ||||| |||||||| ||||  ||||||| |
Db 207 ACCTGGGTTGGAGGAGGCTACATCAATGGGACAGCAGAAGCAGTGTATGGGCCAGGTTGT 266

Qy 241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
       || |||||||||||||||||||| |||||||||||||| ||||| ||||||||||| |||
Db 267 GGTCTAGCTTGGGCTCAGGCACCCATTGGATATTCTCTGAGTCTAATTTTAGGTGGTCTG 326

Qy 301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
       || ||||| |||||||||||||| ||||| |||||||| ||||||||||| ||||| ||
Db 327 TTTTTTGCGAAACCTATGCGTTCCAAGGGATATGTGACTATGTTAGACCCATTTCAACAG 386

Qy 361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGG-AGAAATGTT 419
       ||||||||||| |||||||| || || || || || |||||||||||||| ||| |||||
Db 387 ATCTATGGAAAGCGCATGGGTGGGCTGCTCTTCATCCCTGCACTGATGGGNAGAGATGTT 446

Qy 420 CTGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCA-TCAGCGTGATCA-TCGATGTG 477
       |||||||||||||||||||||||| || || ||||||| |||||||||||| | ||||||
Db 447 CTGGGCTGCAGCAATTTTCTCTGCATTAGGGGCCACCATTCAGCGTGATCATTGGATGTG 506

Qy 478 GATATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGA 537
       ||| || |||| || |||||  |||||||||||||||||| ||| || || || |||||
Db 507 GATGTGAACATATCGGTCATTGTCTCTGCACTCATTGCCATTCTTTATACCCTAGTGGGT 566

Qy 538 GGGCTCTATTCTGTGGCCTACACT-GATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCT 596
       |||||||| |||||||| || ||| ||||| || ||||| || ||||||||| |||| ||
Db 567 GGGCTCTACTCTGTGGCATATACTGGATGTTGTCCAGCTATTCTGCATTTTTATAGGACT 626

Qy 597 GT-GGATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTG 655
       || ||||||| ||||| |||||  ||||||||||||||||| | |||||||| ||||| |
Db 627 GTGGGATCAGTGTCCCTTTTGCCCTGTCACATCCTGCAGTCACCGACATCGGATTCACAG 686

Qy 656 CTGTGCATGCCAA 668
       |||||||||| ||
Db 687 CTGTGCATGCTAA 699
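
The Matches, Mismatches and Indels totals printed for each result can be recomputed from the aligned Qy/Db rows. The sketch below is an illustration only, not the search program's own code: `match_line` and `tally` are hypothetical helper names, and the counting rule (any non-identical pair, including N, is a mismatch; any column containing `-` is an indel) is an assumption that agrees with the totals shown in this listing. It is applied to the final row of RESULT 7.

```python
def match_line(qy: str, db: str) -> str:
    """Return '|' under identical bases and ' ' under mismatches or gaps."""
    if len(qy) != len(db):
        raise ValueError("aligned rows must have equal length")
    return "".join("|" if q == d and q != "-" else " " for q, d in zip(qy, db))


def tally(qy: str, db: str) -> tuple:
    """Count (matches, mismatches, indels) over one aligned row."""
    matches = mismatches = indels = 0
    for q, d in zip(qy, db):
        if q == "-" or d == "-":
            indels += 1          # assumed: a gap on either strand is an indel
        elif q == d:
            matches += 1
        else:
            mismatches += 1      # assumed: N against a base is a mismatch
    return matches, mismatches, indels


qy = "CTGTGCATGCCAA"  # Qy 656..668, RESULT 7
db = "CTGTGCATGCTAA"  # Db 687..699, RESULT 7
print(match_line(qy, db))
print(tally(qy, db))
```

Summing `tally` over all rows of an alignment gives a quick check that a repaired row still agrees with the Matches/Mismatches/Indels line of its result.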



RESULT 8
BY727598
LOCUS BY727598 669 bp mRNA linear EST 17-DEC-2002
DEFINITION BY727598 RIKEN full-length enriched, 6 days neonate medulla
oblongata Mus musculus cDNA clone B730003H24 5', mRNA sequence.
ACCESSION BY727598
VERSION BY727598.1 GI:27140725
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 669)
AUTHORS Okazaki,Y., Furuno,M., Kasukawa,T., Adachi,J., Bono,H., Kondo,S.,
Nikaido,I., Osato,N., Saito,R., Suzuki,H., Yamanaka,I.,
Kiyosawa,H., Yagi,K., Tomaru,Y., Hasegawa,Y., Nogami,A.,
Schonbach,C., Gojobori,T., Baldarelli,R., Hill,D.P., Bult,C.,
Hume,D.A., Quackenbush,J., Schriml,L.M., Kanapin,A., Matsuda,H.,
Batalov,S., Beisel,K.W., Blake,J.A., Bradt,D., Brusic,V.,
Chothia,C., Corbani,L.E., Cousins,S., Dalla,E., Dragani,T.A.,
Fletcher,C.F., Forrest,A., Frazer,K.S., Gaasterland,T.,
Gariboldi,M., Gissi,C., Godzik,A., Gough,J., Grimmond,S.,
Gustincich,S., Hirokawa,N., Jackson,I.J., Jarvis,E.D., Kanai,A.,
Kawaji,H., Kawasawa,Y., Kedzierski,R.M., King,B.L., Konagaya,A.,
Kurochkin,I.V., Lee,Y., Lenhard,B., Lyons,P.A., Maglott,D.R.,
Maltais,L., Marchionni,L., McKenzie,L., Miki,H., Nagashima,T.,
Numata,K., Okido,T., Pavan,W.J., Pertea,G., Pesole,G.,
Petrovsky,N., Pillai,R., Pontius,J.U., Qi,D., Ramachandran,S.,
Ravasi,T., Reed,J.C., Reed,D.J., Reid,J., Ring,B.Z., Ringwald,M.,
Sandelin,A., Schneider,C., Semple,C.A., Setou,M., Shimada,K.,
Sultana,R., Takenaka,Y., Taylor,M.S., Teasdale,R.D., Tomita,M.,
Verardo,R., Wagner,L., Wahlestedt,C., Wang,Y., Watanabe,Y.,
Wells,C., Wilming,L.G., Wynshaw-Boris,A., Yanagisawa,M., Yang,I.,
Yang,L., Yuan,Z., Zavolan,M., Zhu,Y., Zimmer,A., Carninci,P.,
Hayatsu,N., Hirozane-Kishikawa,T., Konno,H., Nakamura,M.,
Sakazume,N., Sato,K., Shiraki,T., Waki,K., Kawai,J., Aizawa,K.,
Arakawa,T., Fukuda,S., Hara,A., Hashizume,W., Imotani,K., Ishii,Y.,
Itoh,M., Kagawa,I., Miyazaki,A., Sakai,K., Sasaki,D., Shibata,K.,
Shinagawa,A., Yasunishi,A., Yoshino,M., Waterston,R., Lander,E.S.,
Rogers,J., Birney,E. and Hayashizaki,Y.

TITLE Analysis of the mouse transcriptome based on functional annotation 

of 60,770 full-length cDNAs 

JOURNAL Nature 420, 563-573 (2002) 

MEDLINE 22354683 
PUBMED 12466851 
COMMENT Contact: Yoshihide Hayashizaki 

Laboratory for Genome Exploration Research Group, RIKEN Genomic 
Sciences Center (GSC), Yokohama Institute 

The Institute of Physical and Chemical Research (RIKEN) 

1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan 

Tel: 81-45-503-9222 

Fax: 81-45-503-9216 

Email: genome-res@gsc.riken.go.jp,

URL: http://genome.gsc.riken.go.jp/

Adachi,J., Aizawa,K., Akimura,T., Arakawa,T., Carninci,P., 
Fukuda,S., Hashizume, W . , Hayashida , K . , Hirozane,T., Hori,F., 
Imotani, K., Ishii,Y., Itoh,M., Kagawa,I., Kawai,J., Kojima,Y., 
Kondo,S., Konno,H. , Koya, S . , Miyazaki,A., Murata,M. , Nakamura,M., 
Nomura, K., Numazaki,R., Ohno,M. , Ohsato,N., Saito,R., Sakazume,N., 
Sano,H., Sasaki, D. , Sato,K., Shibata,K., Shiraki,T., Tagami,M., 
Takeda,Y., Waki,K., Watahiki,A., Muramatsu,M. and Hayashizaki, Y. 
Direct Submission 

Computational Analysis of Full-Length Mouse cDNAs Compared with 
Human Genome Sequences Mamm. Genome. 12, 673-677 (2001)

Normalization and subtraction of cap-trapper-selected cDNAs to 
prepare full-length cDNA libraries for rapid discovery of new 
genes. Genome Res. 10 (10), 1617-1630 (2000) 

RIKEN integrated sequence analysis (RISA) system — 384-format 
sequencing pipeline with 384 multicapillary sequencer. Genome Res. 
10 (11), 1757-1771 (2000) 

Computer-based methods for the mouse full-length cDNA 
encyclopedia: real-time sequence clustering for construction of a 
nonredundant cDNA library. Genome Res. 11 (2), 281-289 (2001) 

cDNA library was prepared and sequenced in Mouse Genome 
Encyclopedia Project of Genome Exploration Research Group in Riken 
Genomic Sciences Center and Genome Science Laboratory in RIKEN. 
Division of Experimental Animal Research in Riken contributed to 
prepare mouse tissues. 



Please visit our web site (http://genome.gsc.riken.go.jp) for
further details.
FEATURES Location/Qualifiers
source 1..669

/organism="Mus musculus"
/mol_type="mRNA"
/db_xref="taxon:10090"
/clone="B730003H24"
/tissue_type="medulla oblongata"
/dev_stage="6 days neonate"
/lab_host="DH10B"
/clone_lib="RIKEN full-length enriched, 6 days neonate
medulla oblongata"
/note="Site_1: SalI; Site_2: BamHI; cDNA library was
prepared and sequenced in Mouse Genome Encyclopedia
Project of Genome Exploration Research Group in Riken
Genomic Sciences Center and Genome Science Laboratory in
RIKEN. Division of Experimental Animal Research in Riken
contributed to prepare mouse tissues. 1st strand cDNA was
primed with a primer [5'
GAGAGAGAGAAGGATCCAAGAGCTCTTTTTTTTTTTTTTTTVN 3'], cDNA was
prepared by using trehalose thermo-activated reverse
transcriptase and subsequently enriched for full-length by
cap-trapper. cDNA went through one round of normalization
to Rot = 20.0 and subtraction to Rot = 459.0. Second
strand cDNA was prepared with the primer adapter of
sequence [5' GAGAGAGAGATTCTCGAGTTAATTAAATTAATCCCCCCCCCCCCC
3']. cDNA was cleaved with XhoI and BamHI. Vector: a
modified pBluescript KS(+) after bulk excision from Lambda
FLC I."



ORIGIN 



Query Match 26.6%; Score 462.8; DB 13; Length 669;

Best Local Similarity 86.7%; Pred. No. 6.8e-115;

Matches 509; Conservative 0; Mismatches 78; Indels 0; Gaps 0;



Qy   1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
       ||| ||||||| || ||||||||| ||||||| ||| | |||||||| || || || |||
Db  81 ATGTCTTTCCACGTAGAAGGACTGGTAGCTATTATCCTCTTCTACCTCCTTATATTTCTG 140

Qy  61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
       ||||||||||||||||| |||| |||||||||||| |||| | ||||||||||||| |||
Db 141 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGCAACCCAGAAGAGCGCAGTGAA 200

Qy 121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
       ||||||||||| || ||||| || |||||||| ||||||||||| ||||||||||||||
Db 201 GCCATCATAGTCGGGGGCCGTGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 260

Qy 181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
       |||||||| |||||||| || |||||||| ||||| |||||||| ||||  ||||||| |
Db 261 ACCTGGGTTGGAGGAGGCTACATCAATGGGACAGCAGAAGCAGTGTATGGGCCAGGTTGT 320

Qy 241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
       || |||||||||||||||||||| |||||||||||||| ||||| ||||||||||| |||
Db 321 GGTCTAGCTTGGGCTCAGGCACCCATTGGATATTCTCTGAGTCTAATTTTAGGTGGTCTG 380

Qy 301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

Db 381 440

Qy 361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
       ||||||||||| |||||||| || || || || || ||||||||||||||||| ||||||
Db 441 ATCTATGGAAAGCGCATGGGTGGGCTGCTCTTCATCCCTGCACTGATGGGAGAGATGTTC 500

Qy 421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
       ||||||||||||||||||||||| || || |||||||||||||||||||| |||||||||
Db 501 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCCACCATCAGCGTGATCATTGATGTGGAT 560

Qy 481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
        || |||| || |||||  |||||||||||||||||| ||| || || || ||||| |||
Db 561 GTGAACATATCGGTCATTGTCTCTGCACTCATTGCCATTCTTTATACCCTAGTGGGTGGG 620

Qy 541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTT 587
       ||||| |||||||| || |||||||| || ||||| || |||||| |
Db 621 CTCTACTCTGTGGCATATACTGATGTTGTCCAGCTATTCTGCATTNT 667
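
The two percentages in each result header appear to follow directly from the other figures on the same lines. The sketch below is a hedged reconstruction: the formulas are inferred from the numbers in this listing, not taken from the search program's documentation, and the function names are hypothetical. `PERFECT_SCORE` is the "Perfect score" of 1743 given in the run header.

```python
PERFECT_SCORE = 1743  # "Perfect score" reported in the run header


def query_match(score: float, perfect: float = PERFECT_SCORE) -> float:
    """Assumed: Query Match % = 100 * Score / Perfect score."""
    return round(100.0 * score / perfect, 1)


def best_local_similarity(matches: int, mismatches: int, indels: int) -> float:
    """Assumed: Best Local Similarity % = 100 * Matches / aligned columns."""
    return round(100.0 * matches / (matches + mismatches + indels), 1)


# Figures printed for RESULT 8 (BY727598)
print(query_match(462.8))                 # listing shows 26.6%
print(best_local_similarity(509, 78, 0))  # listing shows 86.7%
```

The same two functions reproduce the percentages printed for RESULT 7 (472.2; 580/88/5) and RESULT 9 (404; 446/70/0), which is the only evidence offered here for the assumed formulas.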



RESULT 9
BE233479
LOCUS BE233479 516 bp mRNA linear EST 10-JUL-2000
DEFINITION 139685 MARC 1PIG Sus scrofa cDNA 5', mRNA sequence.
ACCESSION BE233479
VERSION BE233479.1 GI:9018197
KEYWORDS EST.
SOURCE Sus scrofa (pig)
ORGANISM Sus scrofa
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Cetartiodactyla; Suina; Suidae; Sus.
REFERENCE 1 (bases 1 to 516)
AUTHORS Fahrenkrug,S.C., Smith,T.P.L., Freking,B.A., Cho,J., White,J.,
Vallet,J., Wise,T., Rohrer,G.A., Pertea,G., Sultana,R.,
Quackenbush,J. and Keele,J.W.
TITLE Porcine gene discovery by normalized cDNA-library sequencing and
EST cluster assembly
JOURNAL Mamm. Genome 13 (8), 475-478 (2002)
MEDLINE 22213789
PUBMED 12226715
COMMENT Contact: Smith TPL
USDA, ARS, US Meat Animal Research Center
PO Box 166, Clay Center, NE 68933-0166, USA
Tel: 402 762 4366
Fax: 402 762 4390
Email: smith@email.marc.usda.gov
Single pass sequencing. Bases called and alt_trimmed with phred
v0.980904.e. Vector identified by cross_match with the -minscore 18
and -minmatch 12 options.
PCR PRimers
FORWARD: AGGAAACAGCTATGACCAT
BACKWARD: GTTTTCCCAGTCACGACG
Plate: 75 row: G column: 12
Seq primer: ATTTAGGTGACACTATAG.
FEATURES Location/Qualifiers
source 1..516
/organism="Sus scrofa"
/mol_type="mRNA"
/db_xref="taxon:9823"
/tissue_type="pooled"
/lab_host="DH10B"
/clone_lib="MARC 1PIG"
/note="Vector: pCMV SPORT 6; Site_1: NotI; Site_2: SalI;
Library made from pooled tissue from day 11, 13, 15, 20,
and 30 embryos."



ORIGIN 



Query Match 23.2%; Score 404; DB 10; Length 516;

Best Local Similarity 86.4%; Pred. No. 7.2e-99;

Matches 446; Conservative 0; Mismatches 70; Indels 0; Gaps 0;



Qy 1067 GTTCCATGTTTGCACGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAG 1126
        |||| ||||||||  | |||||||| ||||| || ||||||||||| ||||| ||||  |
Db    1 GTTCTATGTTTGCTAGAAACATCTAGCAGCTCTCATTCAGACAAAACGCTTCCGACAGGG 60

Qy 1127 AAATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGG 1186
        | ||||| ||||| ||||| |||||||| || |||||||| || ||||||||||||||||
Db   61 AGATCGTCTGGGTCATGCGGATCACAGTATTCGTGTTTGGTGCGTCTGCAACAGCCATGG 120

Qy 1187 CCTTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCG 1246
        |||||||||| || || ||||||||||||||||||||||| || ||||| || ||||||
Db  121 CCTTGCTGACCAAGACCGTGTATGGGCTCTGGTACCTCAGCTCCGACCTCGTCTACATCA 180

Qy 1247 TTATCTTCCCCCAGCTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCG 1306
        |||||||||| |||||||| ||||| |||||  | ||||| |||||||| || |||||||
Db  181 TTATCTTCCCGCAGCTGCTCTGTGTGCTCTTCATCAAGGGGACCAACACGTACGGGGCCG 240

Qy 1307 TGGCAGGTTATGTTTCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATC 1366
        ||||||| ||  || ||||||| ||||||||  |||| || || |||||||| ||| | |
Db  241 TGGCAGGGTACATTGCTGGCCTTTTCCTGAGGGTAACCGGTGGAGAGCCATACCTGAACC 300

Qy 1367 TTCAGCCCTTGATCTTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGA 1426
        | |||||||||||||| |||||||| ||||||  |||  | |||||||||||||||||||
Db  301 TGCAGCCCTTGATCTTTTACCCTGGTTATTACGTTGAAAAAAATGGTATATATAATCAGA 360

Qy 1427 AATTTCCATTTAAAACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCT 1486
         ||| ||||||||||| ||||||||| | || || |||||| |||||||||||||||| |
Db  361 GATTCCCATTTAAAACCCTTGCCATGCTCACCTCCTTCTTATCCAACATTTGCATCTCTT 420

Qy 1487 ATCTAGCCAAGTATCTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATG 1546
        |||||||||| ||||||||||||||||||||||||||||| ||||||||| | |||||||
Db  421 ATCTAGCCAAATATCTATTTGAAAGTGGAACCTTGCCACCAAAATTAGATATGTTTGATG 480

Qy 1547 CTGTTGTTGCAAGACACAGTGAAGAAAACATGGATA 1582
        ||||||||||||||||||||||||||||||||||||
Db  481 CTGTTGTTGCAAGACACAGTGAAGAAAACATGGATA 516



RESULT 10
BB626260
LOCUS BB626260 650 bp mRNA linear EST 26-OCT-2001
DEFINITION BB626260 RIKEN full-length enriched, adult male diencephalon Mus
musculus cDNA clone 9330170D24 5', mRNA sequence.
ACCESSION BB626260
VERSION BB626260.1 GI:16464298
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 650)
AUTHORS Arakawa,T., Carninci,P., Fukuda,S., Furuno,M., Hanagaki,T.,
Hara,A., Hiramoto,K., Hori,F., Ishii,Y., Ito,M., Kawai,J.,
Konno,H., Kouda,M., Koya,S., Matsuyama,T., Miyazaki,A., Nomura,K.,
Ohno,M., Okazaki,Y., Okido,T., Saito,R., Sakai,C., Sakai,K.,
Sano,H., Sasaki,D., Shibata,K., Shinagawa,A., Shiraki,T.,
Sogabe,Y., Suzuki,H., Tagami,M., Tagawa,A., Takahashi,F.,
Takeda,Y., Tanaka,T., Toya,T., Muramatsu,M. and Hayashizaki,Y.
TITLE RIKEN Mouse ESTs (Arakawa,T., et al. 2001)
JOURNAL Unpublished (2001)
COMMENT Contact: Yoshihide Hayashizaki
Laboratory for Genome Exploration Research Group, RIKEN Genomic
Sciences Center (GSC), Yokohama Institute
The Institute of Physical and Chemical Research (RIKEN)
1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
Tel: 81-45-503-9222
Fax: 81-45-503-9216
Email: genome-res@gsc.riken.go.jp,
URL: http://genome.gsc.riken.go.jp/

Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
Normalization and subtraction of cap-trapper-selected cDNAs to
prepare full-length cDNA libraries for rapid discovery of new
genes. Genome Res. 10 (10), 1617-1630 (2000)

wagi,K., Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E.,
Watahiki,M., Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T.,
Matsuura,S., Kawai,J., Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A.
and Hayashizaki,Y.
RIKEN integrated sequence analysis (RISA) system -- 384-format
sequencing pipeline with 384 multicapillary sequencer. Genome Res.
10 (11), 1757-1771 (2000)

Konno,H., Fukunishi,Y., Shibata,K., Itoh,M., Carninci,P.,
Sugahara,Y. and Hayashizaki,Y.
Computer-based methods for the mouse full-length cDNA
encyclopedia: real-time sequence clustering for construction of a
nonredundant cDNA library. Genome Res. 11 (2), 281-289 (2001)

Kondo,S., Shinagawa,A., Saito,T., Kiyosawa,H., Yamanaka,I.,
Aizawa,K., Fukuda,S., Hara,A., Itoh,M., Kawai,J., Shibata,K. and
Hayashizaki,Y.
Computational Analysis of Full-Length Mouse cDNAs Compared with
Human Genome Sequences. Mamm. Genome. 12, 673-677 (2001)

Please visit our web site (http://genome.gsc.riken.go.jp) for
further details.
e mouse tissues.
FEATURES Location/Qualifiers
source 1..650

/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL/6J"
/db_xref="taxon:10090"
/clone="9330170D24"
/sex="male"
/tissue_type="diencephalon"
/dev_stage="adult"
/lab_host="DH10B"
/clone_lib="RIKEN full-length enriched, adult male
diencephalon"
/note="Site_1: SalI; Site_2: BamHI; cDNA library was
prepared and sequenced in Mouse Genome Encyclopedia
Project of Genome Exploration Research Group in Riken
Genomic Sciences Center and Genome Science Laboratory in
RIKEN. Division of Experimental Animal Research in Riken
contributed to prepare mouse tissues. 1st strand cDNA was
primed with a primer [5'
GAGAGAGAGAAGGATCCAAGAGCTCTTTTTTTTTTTTTTTTVN 3'], cDNA was
prepared by using trehalose thermo-activated reverse
transcriptase and subsequently enriched for full-length by
cap-trapper. cDNA went through one round of normalization
to Rot = 10.0 and subtraction to Rot = 185.0. Second
strand cDNA was prepared with the primer adapter of
sequence [5' GAGAGAGAGATTCTCGAGTTAATTAAATTAATCCCCCCCCCCCCC
3']. cDNA was cloned into the XhoI and BamHI sites.
Vector: a modified pBluescript KS(+) after bulk excision
from Lambda FLC I. Cloning sites, 5' end: SalI; 3' end:
BamHI"



ORIGIN 



Query Match 18.9%; Score 329.8; DB 10; Length 650; 

Best Local Similarity 86.0%; Pred. No. 1.4e-78; 

Matches 375; Conservative 0; Mismatches 60; Indels 1; Gaps 1; 

Qy   1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
       ||| ||||||| || ||||||||| ||||||| ||| | |||||||| || || || |||
Db 201 ATGTCTTTCCACGTAGAAGGACTGGTAGCTATTATCCTCTTCTACCTCCTTATATTTCTG 260

Qy  61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
       ||||||||||||||||| |||| |||||||||||| |||| | ||||||||||||| |||
Db 261 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGCAACCCAGAAGAGCGCAGTGAA 320

Qy 121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
       ||||||||||| || ||||| || |||||||| ||||||||||| ||||||||||||||
Db 321 GCCATCATAGTCGGGGGCCGTGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 380

Qy 181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
       |||||||| |||||||| || |||||||| ||||| |||||||| ||||  ||||||| |
Db 381 ACCTGGGTTGGAGGAGGCTACATCAATGGGACAGCAGAAGCAGTGTATGGGCCAGGTTGT 440

Qy 241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
       || |||||||| |||| |||||| |||||||||||||| ||||| ||||||||||| |||
Db 441 GGTCTAGCTTGNGCTCTGGCACCCATTGGATATTCTCTGAGTCTAATTTTAGGTGGTCTG 500

Qy 301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
       || |||||  |||||||||  || ||||| |||||||| ||||||||||| ||||| ||
Db 501 TTTTTTGC-GAACCTATGCNGTCCAAGGGATATGTGACTATGTTAGACCCATTTCAACAG 559

Qy 361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
       ||||||||||| |||||||| || || || || || ||||| ||||||||||| ||||||
Db 560 ATCTATGGAAAGCGCATGGGTGGGCTGCTCTTCATCCCTGCCCTGATGGGAGAGATGTTC 619

Qy 421 TGGGCTGCAGCAATTT 436
       ||||||||||| ||||
Db 620 TGGGCTGCAGCCATTT 635
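
Each alignment row carries its own checksum: the end coordinate must equal the start coordinate plus the number of bases on the row, not counting gap characters. The sketch below uses two hypothetical helpers, `clean_bases` and `row_is_consistent`, to strip stray spaces from a scanned row and test that relationship. It assumes, as the rows in this listing suggest, that `-` is the only gap symbol and that coordinates are inclusive.

```python
def clean_bases(seq: str) -> str:
    """Drop spaces introduced by scanning and alignment gap characters."""
    return "".join(c for c in seq if c not in " -")


def row_is_consistent(start: int, seq: str, end: int) -> bool:
    """True when start + (number of bases) - 1 equals the printed end."""
    return start + len(clean_bases(seq)) - 1 == end


# Final row of RESULT 10
print(row_is_consistent(421, "TGGGCTGCAGCAATTT", 436))  # Qy
print(row_is_consistent(620, "TGGGCTGCAGCCATTT", 635))  # Db
```

A row that fails this check has usually lost or gained a base during scanning, so the test is a cheap way to decide which rows need to be compared against the same query segment in another result.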



RESULT 11
AW668962
LOCUS AW668962 541 bp mRNA linear EST 25-APR-2001
DEFINITION 111664 MARC 1BOV Bos taurus cDNA 5', mRNA sequence.
ACCESSION AW668962
VERSION AW668962.1 GI:7525476
KEYWORDS EST.
SOURCE Bos taurus (cow)
ORGANISM Bos taurus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea;
Bovidae; Bovinae; Bos.
REFERENCE 1 (bases 1 to 541)
AUTHORS Smith,T.P.L., Grosse,W.M., Freking,B.A., Roberts,A.J., Stone,R.T.,
Casas,E., Wray,J.E., White,J., Cho,J., Fahrenkrug,S.C.,
Bennett,G.L., Heaton,M.P., Laegreid,W.W., Rohrer,G.A.,
Chitko-McKown,C.G., Pertea,G., Holt,I., Karamycheva,S., Liang,F.,
Quackenbush,J. and Keele,J.W.
TITLE Sequence evaluation of four pooled-tissue normalized bovine cDNA
libraries and construction of a gene index for cattle
JOURNAL Genome Res. 11 (4), 626-630 (2001)
MEDLINE 21180013
PUBMED 11282978
COMMENT Contact: Smith TPL
USDA, ARS, US Meat Animal Research Center
PO Box 166, Clay Center, NE 68933-0166, USA
Tel: 402 762 4366
Fax: 402 762 4390
Email: smith@email.marc.usda.gov
Single pass sequencing. Bases called and alt_trimmed with phred
v0.980904.e. Vector identified by cross_match with the -minscore 18
and -minmatch 12 options.
PCR PRimers
FORWARD: AGGAAACAGCTATGACCAT
BACKWARD: GTTTTCCCAGTCACGACG
Plate: 95 row: L column: 20
Seq primer: ATTTAGGTGACACTATAG.
FEATURES Location/Qualifiers
source 1..541
/organism="Bos taurus"
/mol_type="mRNA"
/db_xref="taxon:9913"
/tissue_type="pooled"
/lab_host="DH10B"
/clone_lib="MARC 1BOV"
/note="Vector: pCMV SPORT 6; Site_1: NotI; Site_2: SalI;
Library made from pooled tissue from lymph node, ovary,
fat, hypothalamus, and pituitary."



ORIGIN 



Query Match 17.9%; Score 312.8; DB 10; Length 541;

Best Local Similarity 85.0%; Pred. No. 5.9e-74;

Matches 350; Conservative 0; Mismatches 62; Indels 0; Gaps 0;



Qy  889 TCAACAGACTGGAACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCA 948
        ||||||| |||||||||||||||||| |||| ||  | ||||| |   | | | ||||||
Db  130 TCAACAGCCTGGAACCAGACTGCATACGGGCCTCTTGCTCCCAGGGAGAAACAGGAGGCA 189

Qy  949 GACATGATTTTACCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGT 1008
        |||||||| || || ||||| ||  ||||||||||||| ||||| ||||||| |||||||
Db  190 GACATGATCTTGCCGATTGTCCTCAAGTATCTCTGCCCCGTGTACATTTCTTACTTTGGT 249

Qy 1009 CTTGGTGCAGTTTCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGT 1068
        ||||| || |||||||||||||| ||||| ||||||||||||||||||||||||||||||
Db  250 CTTGGAGCCGTTTCTGCTGCTGTCATGTCCTCAGCAGATTCTTCCATCTTGTCAGCAAGT 309

Qy 1069 TCCATGTTTGCACGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAA 1128
        || |||||||| || ||||||||||||||||| ||||||||||||||||| ||||| ||
Db  310 TCGATGTTTGCTCGCAACATCTACCAGCTTTCATTCAGACAAAATGCTTCTGACAAGGAG 369

Qy 1129 ATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCC 1188
        || || ||||| ||||| ||||| || |||||||||||||| ||||| |   ||||||||
Db  370 ATAGTCTGGGTCATGCGCATCACGGTATTTGTGTTTGGAGCTTCTGCGATGACCATGGCC 429

Qy 1189 TTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTT 1248
        ||||| ||||| || ||||||||||||||||||||||| |||||||| || |||||| |
Db  430 TTGCTAACGAAGACGGTGTATGGGCTCTGGTACCTCAGCTCTGACCTGGTCTACATCATC 489

Qy 1249 ATCTTCCCCCAGCTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATG 1300
        |||||||| ||| |||| || || |||||  | ||||| |||||||| ||||
Db  490 ATCTTCCCGCAGTTGCTCTGCGTGCTCTTCATCAAGGGTACCAACACGTATG 541



RESULT 12
BY729567
LOCUS BY729567 675 bp mRNA linear EST 17-DEC-2002
DEFINITION BY729567 RIKEN full-length enriched, 12 days embryo spinal cord Mus
musculus cDNA clone C530033E06 5', mRNA sequence.
ACCESSION BY729567
VERSION BY729567.1 GI:27142694
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 675)
AUTHORS Okazaki,Y., Furuno,M., Kasukawa,T., Adachi,J., Bono,H., Kondo,S.,
Nikaido,I., Osato,N., Saito,R., Suzuki,H., Yamanaka,I.,
Kiyosawa,H., Yagi,K., Tomaru,Y., Hasegawa,Y., Nogami,A.,
Schonbach,C., Gojobori,T., Baldarelli,R., Hill,D.P., Bult,C.,
Hume,D.A., Quackenbush,J., Schriml,L.M., Kanapin,A., Matsuda,H.,
Batalov,S., Beisel,K.W., Blake,J.A., Bradt,D., Brusic,V.,
Chothia,C., Corbani,L.E., Cousins,S., Dalla,E., Dragani,T.A.,
Fletcher,C.F., Forrest,A., Frazer,K.S., Gaasterland,T.,
Gariboldi,M., Gissi,C., Godzik,A., Gough,J., Grimmond,S.,
Gustincich,S., Hirokawa,N., Jackson,I.J., Jarvis,E.D., Kanai,A.,
Kawaji,H., Kawasawa,Y., Kedzierski,R.M., King,B.L., Konagaya,A.,
Kurochkin,I.V., Lee,Y., Lenhard,B., Lyons,P.A., Maglott,D.R.,
Maltais,L., Marchionni,L., McKenzie,L., Miki,H., Nagashima,T.,
Numata,K., Okido,T., Pavan,W.J., Pertea,G., Pesole,G.,
Petrovsky,N., Pillai,R., Pontius,J.U., Qi,D., Ramachandran,S.,
Ravasi,T., Reed,J.C., Reed,D.J., Reid,J., Ring,B.Z., Ringwald,M.,
Sandelin,A., Schneider,C., Semple,C.A., Setou,M., Shimada,K.,
Sultana,R., Takenaka,Y., Taylor,M.S., Teasdale,R.D., Tomita,M.,
Verardo,R., Wagner,L., Wahlestedt,C., Wang,Y., Watanabe,Y.,
Wells,C., Wilming,L.G., Wynshaw-Boris,A., Yanagisawa,M., Yang,I.,
Yang,L., Yuan,Z., Zavolan,M., Zhu,Y., Zimmer,A., Carninci,P.,
Hayatsu,N., Hirozane-Kishikawa,T., Konno,H., Nakamura,M.,
Sakazume,N., Sato,K., Shiraki,T., Waki,K., Kawai,J., Aizawa,K.,
Arakawa,T., Fukuda,S., Hara,A., Hashizume,W., Imotani,K., Ishii,Y.,
Itoh,M., Kagawa,I., Miyazaki,A., Sakai,K., Sasaki,D., Shibata,K.,
Shinagawa,A., Yasunishi,A., Yoshino,M., Waterston,R., Lander,E.S.,
Rogers,J., Birney,E. and Hayashizaki,Y.

TITLE Analysis of the mouse transcriptome based on functional annotation 

of 60,770 full-length cDNAs

JOURNAL Nature 420, 563-573 (2002) 

MEDLINE 22354683 
PUBMED 12466851 
COMMENT Contact: Yoshihide Hayashizaki 

Laboratory for Genome Exploration Research Group, RIKEN Genomic 
Sciences Center (GSC), Yokohama Institute 

The Institute of Physical and Chemical Research (RIKEN) 

1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan 

Tel: 81-45-503-9222 

Fax: 81-45-503-9216 

Email: genome-res@gsc.riken.go.jp,
URL: http://genome.gsc.riken.go.jp/

Adachi,J., Aizawa,K., Akimura,T., Arakawa,T., Carninci,P., 
Fukuda,S., Hashizume, W . , Hayashida, K. , Hirozane,T., Hori,F., 
Imotani,K., Ishii,Y., Itoh,M., Kagawa,I., Kawai,J., Kojima,Y., 
Kondo,S., Konno,H., Koya,S., Miyazaki,A., Murata,M., Nakamura,M., 
Nomura, K., Numazaki,R., Ohno,M., Ohsato,N., Saito,R., Sakazume,N., 
Sano,H., Sasaki, D., Sato,K., Shibata,K., Shiraki,T., Tagami,M., 
Takeda,Y., Waki,K., Watahiki,A., Muramatsu,M. and Hayashizaki, Y. 
Direct Submission 

Computational Analysis of Full-Length Mouse cDNAs Compared with 
Human Genome Sequences Mamm. Genome. 12, 673-677 (2001) 

Normalization and subtraction of cap-trapper-selected cDNAs to 
prepare full-length cDNA libraries for rapid discovery of new 
genes. Genome Res. 10 (10), 1617-1630 (2000)

RIKEN integrated sequence analysis (RISA) system — 384-format 
sequencing pipeline with 384 multicapillary sequencer. Genome Res. 
10 (11), 1757-1771 (2000) 

Computer-based methods for the mouse full-length cDNA 
encyclopedia: real-time sequence clustering for construction of a 
nonredundant cDNA library. Genome Res. 11 (2), 281-289 (2001) 

cDNA library was prepared and sequenced in Mouse Genome 
Encyclopedia Project of Genome Exploration Research Group in Riken 
Genomic Sciences Center and Genome Science Laboratory in RIKEN. 
Division of Experimental Animal Research in Riken contributed to 
prepare mouse tissues. 

Please visit our web site (http://genome.gsc.riken.go.jp) for 
further details. 
FEATURES Location/Qualifiers
source 1..675
/organism="Mus musculus"
/mol_type="mRNA"
/db_xref="taxon:10090"
/clone="C530033E06"
/tissue_type="spinal cord"
/dev_stage="12 days embryo"
/lab_host="DH10B"
/clone_lib="RIKEN full-length enriched, 12 days embryo
spinal cord"
/note="Site_1: SalI; Site_2: BamHI; cDNA library was
prepared and sequenced in Mouse Genome Encyclopedia
Project of Genome Exploration Research Group in Riken
Genomic Sciences Center and Genome Science Laboratory in
RIKEN. Division of Experimental Animal Research in Riken
contributed to prepare mouse tissues. 1st strand cDNA was
primed with a primer [5'
GAGAGAGAGAAGGATCCAAGAGCTCTTTTTTTTTTTTTTTTVN 3'], cDNA was
prepared by using trehalose thermo-activated reverse
transcriptase and subsequently enriched for full-length by
cap-trapper. Second strand cDNA was prepared with the
primer adapter of sequence [5'
GAGAGAGAGATTCTCGAGTTAATTAAATTAATCCCCCCCCCCCCC 3']. cDNA
was cleaved with XhoI and BamHI. Vector: a modified
pBluescript KS(+) after bulk excision from Lambda FLC I."



ORIGIN 



Query Match 16.6%; Score 290; DB 13; Length 675; 

Best Local Similarity 86.5%; Pred. No. 1.1e-67;

Matches 320; Conservative 0; Mismatches 50; Indels 0; Gaps 0; 

Qy  889 TCAACAGACTGGAACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCA 948
        || |||||||||||||||||||| || |||  |||||||||||||||||  || || |||
Db  306 TCCACAGACTGGAACCAGACTGCCTACGGGTATCCAGATCCCAAGACTAAGGAGGAAGCA 365

Qy  949 GACATGATTTTACCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGT 1008
        ||||||||| | || || ||||||||||| |||||||||||||| || || ||||||||
Db  366 GACATGATTCTCCCGATCGTTCTGCAGTACCTCTGCCCTGTGTACATCTCCTTCTTTGGG 425

Qy 1009 CTTGGTGCAGTTTCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGT 1068
        |||||||| ||||| |||||||| ||||| ||||| || || |||||| |||| || |||
Db  426 CTTGGTGCTGTTTCAGCTGCTGTCATGTCCTCAGCTGACTCGTCCATCCTGTCGGCGAGT 485

Qy 1069 TCCATGTTTGCACGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAA 1128
        || |||||||| ||||| ||||||||||||||||||||||||||||| || ||||| |||
Db  486 TCTATGTTTGCTCGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAA 545

Qy 1129 ATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCC 1188
        || || ||||| ||| | ||||| ||| ||||||| |||||||||||||||||||||||
Db  546 ATTGTGTGGGTCATGAGGATCACTGTGCTTGTGTTCGGAGCATCTGCAACAGCCATGGCT 605

Qy 1189 TTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTT 1248
        ||||||||||| ||||||||||||||||||||||| || ||||||||||| |||||| |
Db  606 TTGCTGACGAAGACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATC 665

Qy 1249

Db  666



RESULT 13
BE723927
LOCUS       BE723927 524 bp mRNA linear EST 25-APR-2001
DEFINITION  198406 MARC 4BOV Bos taurus cDNA 5', mRNA sequence.
ACCESSION   BE723927
VERSION     BE723927.1 GI:10125223
KEYWORDS    EST.
SOURCE      Bos taurus (cow)
  ORGANISM  Bos taurus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea;
            Bovidae; Bovinae; Bos.
REFERENCE   1 (bases 1 to 524)
  AUTHORS   Smith,T.P.L., Grosse,W.M., Freking,B.A., Roberts,A.J., Stone,R.T.,
            Casas,E., Wray,J.E., White,J., Cho,J., Fahrenkrug,S.C.,
            Bennett,G.L., Heaton,M.P., Laegreid,W.W., Rohrer,G.A.,
            Chitko-McKown,C.G., Pertea,G., Holt,I., Karamycheva,S., Liang,F.,
            Quackenbush,J. and Keele,J.W.
  TITLE     Sequence evaluation of four pooled-tissue normalized bovine cDNA
            libraries and construction of a gene index for cattle
  JOURNAL   Genome Res. 11 (4), 626-630 (2001)
  MEDLINE   21180013
   PUBMED   11282978
COMMENT     Contact: Smith TPL
            USDA, ARS, US Meat Animal Research Center
            PO Box 166, Clay Center, NE 68933-0166, USA
            Tel: 402 762 4366
            Fax: 402 762 4390
            Email: smith@email.marc.usda.gov
            Single pass sequencing. Bases called and alt_trimmed with phred
            v0.980904.e. Vector identified by crossmatch with the -minscore 18
            and -minmatch 12 options.
            PCR PRimers
            FORWARD: AGGAAACAGCTATGACCAT
            BACKWARD: GTTTTCCCAGTCACGACG
            Plate: 106 row: L column: 14
            Seq primer: ATTTAGGTGACACTATAG.
FEATURES             Location/Qualifiers
     source          1..524
                     /organism="Bos taurus"
                     /mol_type="mRNA"
                     /db_xref="taxon:9913"
                     /tissue_type="pooled"
                     /lab_host="DH10B"
                     /clone_lib="MARC 4BOV"
                     /note="Vector: pCMV SPORT 6; Site_1: NotI; Site_2: SalI;
                     Library made from pooled tissue from day 20 and day 40
                     embryos."
ORIGIN



Query Match 15.7%; Score 274.2; DB 10; Length 524;
Best Local Similarity 84.7%; Pred. No. 2e-63;
Matches 331; Conservative 0; Mismatches 58; Indels 2; Gaps 2;

Qy    889 TCAACAGACTGGAACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCA 948
          ||||||  |||||||||||||||||| |||| ||  | ||||| |   | | | ||||||
Db    130 TCAACATCCTGGAACCAGACTGCATACGGGCCTCTTGCTCCCAGGGAGAAACAGGAGGCA 189

Qy    949 GACATGATTTTACCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGT 1008
          |||||||| || || ||||| ||  ||||||||||||| ||||| ||||||| |||||||
Db    190 GACATGATCTTGCCGATTGTCCTCAAGTATCTCTGCCCCGTGTACATTTCTTACTTTGGT 249

Qy   1009 CTTGGTGCAGTTTCTGCTGCTGTTATGTCATCAGCAGATTC-TTCCATCTTGTCAGCAAG 1067
          ||||| || |||||||||||||| ||||| ||||||||||| ||||||||||||||||||
Db    250 CTTGGAGCCGTTTCTGCTGCTGTCATGTCCTCAGCAGATTCTTTCCATCTTGTCAGCAAG 309

Qy   1068 TTCCATGTTTGCACGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGA 1127
          ||| |||||||| || |||||||||||||||||| ||||||||||||||||| ||||| ||
Db    310 TTCGATGTTTGCTCGCAACATCTACCAGCTTTCATTCAGACAAAATGCTTCTGACAAGGA 369

Qy   1128 AATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGC 1187
           || || ||||| ||||| ||||| || |||||||||||||| ||||| |   |||||||
Db    370 GATAGTCTGGGTCATGCGCATCACGGTATTTGTGTTTGGAGCTTCTGCGATGACCATGGC 429

Qy   1188 CTTGCTGACGAAAACTGTGTATGGGCTC-TGGTACCTCAGTTCTGACCTTGTTTACATCG 1246
          |||||| ||||| || |||||||||||| ||||||||||| |||||||| || |||||| 
Db    430 CTTGCTAACGAAGACGGTGTATGGGCTCTTGGTACCTCAGCTCTGACCTGGTCTACATCA 489

Qy   1247 TTATCTTCCCCCAGCTGCTTTGTGTACTCTT 1277
          | |||||||| ||| |||| || || |||||
Db    490 TCATCTTCCCGCAGTTGCTCTGCGTGCTCTT 520



RESULT 14
AL669749
LOCUS       AL669749 800 bp mRNA linear EST 14-JAN-2002
DEFINITION  AL669749 directional larval cDNA library Ciona intestinalis cDNA
            clone 052ZB03 5', mRNA sequence.
ACCESSION   AL669749
VERSION     AL669749.1 GI:18143007
KEYWORDS    EST.
SOURCE      Ciona intestinalis
  ORGANISM  Ciona intestinalis
            Eukaryota; Metazoa; Chordata; Urochordata; Ascidiacea; Enterogona;
            Phlebobranchia; Cionidae; Ciona.
REFERENCE   1 (bases 1 to 800)
  AUTHORS   Genoscope.
  TITLE     Ciona intestinalis directional larval cDNA library
  JOURNAL   Unpublished (2002)
COMMENT     Contact: Genoscope
            Genoscope - Centre National de Sequencage
            BP 191 91006 EVRY cedex - France
            Email: seqref@genoscope.cns.fr, Web : www.genoscope.cns.fr
            IMPORTANT: this sequence may contain errors. The Ciona intestinalis
            library from which the clone was isolated may be contaminated with
            cDNAs from bacteria or other Eukarya.
            Directional larval cDNA library originate from Dr.M.Branno,
            Stazione A.Dohrn, Naples, Italy, and was prepared in
            pBluescript2SK+.
FEATURES             Location/Qualifiers
     source          1..800
                     /organism="Ciona intestinalis"
                     /mol_type="mRNA"
                     /db_xref="taxon:7719"
                     /clone="052ZB03"
                     /clone_lib="directional larval cDNA library"
                     /note="Vector : pBluescript2SK+"
ORIGIN

Query Match 15.1%; Score 263.2; DB 9; Length 800; 

Best Local Similarity 62.7%; Pred. No. 2.3e-60; 

Matches 504; Conservative 0; Mismatches 286; Indels 14; Gaps 6; 
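
The three summary lines above appear to be related by simple ratios. The sketch below is an inference from the figures printed in this listing, not documented GenCore behaviour, and the function names are hypothetical: Query Match is taken as the score divided by the perfect score (1743, from the run header), and Best Local Similarity as the matches divided by all aligned columns (matches, mismatches and indels).

```python
def query_match(score: float, perfect_score: float) -> float:
    """Score as a percentage of the perfect (self-match) score.

    Assumption: inferred from the numbers in this listing.
    """
    return 100.0 * score / perfect_score


def best_local_similarity(matches: int, mismatches: int, indels: int) -> float:
    """Matching columns as a percentage of all aligned columns.

    Assumption: inferred from the numbers in this listing.
    """
    return 100.0 * matches / (matches + mismatches + indels)


# Figures printed for RESULT 14; 1743 is the perfect score in the run header.
print(f"Query Match           {query_match(263.2, 1743):.1f}%")
print(f"Best Local Similarity {best_local_similarity(504, 286, 14):.1f}%")
```

The same two ratios also agree with the figures printed for the other results in this section, for example a score of 290 with 320 matches and 50 mismatches.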

Qy    250 TGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTGTTCTTTGC- 308
          ||| | || ||||| ||||||||  ||       | ||  | || ||| | |||||||| 
Db      2 TGGACGCAAGCACCCATTGGATACGCTTGCGCGTTAATACTTGGCGGCTTATTCTTTGCG 61

Qy    309 AAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAAATCTATGG 368
          ||     |||||     |||| |||||||| ||||| || ||  | ||||  | || |||
Db     62 AAGTAAAATGCGAAGTGAGGGATATGTGACGATGTTGGATCCACTGCAGCGCAACT-TGG 120

Qy    369 AAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTCTGGGCTGC 428
             |   ||||| |   | || | ||| ||||||||    |||||| | |||||| ||||
Db    121 TCGAGTAATGGGAGCGTTTCTTTATATACCTGCACTTGCTGGAGAATTATTCTGGTCTGC 180

Qy    429 AGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGATATGCACAT 488
          ||| || ||  | || ||||| |  ||| |||  || ||||| ||| |  ||||      
Db    181 AGCTATATTGGCCGCGTTGGGCGGTACCTTCA-TGTTATCATTGATCTTCATATAACTGC 239

Qy    489 TTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGGCTCTATTC 548
            |||| ||  | ||||||  ||||||   | | |||||  |||  || || || || ||
Db    240 AGCTGTAATAGTATCTGCATGCATTGCTGTTGTATACACCATGGCCGGTGGTCTTTACTC 299

Qy    549 TGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGGATCAGCGT 608
           || || || || ||||| |||||| |  ||||||| ||  | || |||||| | ||| |
Db    300 GGTTGCTTATACAGATGTAGTTCAGTTGATTTGCATATTCATTGGACTGTGGTTGAGCAT 359

Qy    609 CCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTGCATGCCAA 668
           || || || ||  | |||||||| ||  |||||||||     || |||   ||  |   
Db    360 TCCATTCGCGTTCACTCATCCTGCTGTATCAGACATCGCCACTACAGCTTACCACTCACC 419

Qy    669 ATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGGCTTGATAG 728
            ||         ||||| || |||   ||    ||             ||| | ||   
Db    420 TAAC---------TGGCTTGGTACTTGGGATATTTCGACCACTGGTCTATGGATCGACTC 470

Qy    729 TTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGGGTTCTCTC 788
          |  |||| |  || |  | ||||||||| || |||||||  |||||||| || ||| | ||
Db    471 TGCTCTGCTACTGTTATTTGGTGGAATACCGTGGCAAGTTTACTTTCAAAGAGTTTTATC 530

Qy    789 TTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGGTGCCTGGT 848
            |        |       |||||   ||| || ||| | || || ||||| ||  || |
Db    531 GNCNAAAAGNNCANGAAGCGCTCAGAAGCTTTCATTCATTGCTGCGTTCGGATGTTTGTT 590

Qy    849 GATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGGAACCAGAC 908
           ||| | || ||  | ||| | || || || ||||  ||||| ||||| ||| ||   ||
Db    591 CATGTCAATACCTTCGATATTNATCGGTGCAATTGCTGCATCTACAGATTGGGACGCAAC 650

Qy    909 TGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTACCAATTGT 968
            | || || || |||  |||       | ||  ||   || ||  ||| |||| |||||
Db    651 ATCGTACGGCCTCCCAAGTCCAGTTGANAAAGGCGACCAAGCCAATATTCTACCCATTGT 710

Qy    969 TCTGCAGTATCTC-TGCCCTGTGTATATTTCTTTCTTT-GGTCTTGGTGCAGTTTCTGCT 1026
           || || || |||   ||||||   | | || |||||| || ||||| || |||||||||
Db    711 GCTTCAATACCTCACCCCCTGTAGCTGTATCATTCTTTGGGGCTTGGCGCTGTTTCTGCT 770

Qy   1027 GCTGTTATGTCATCAGCAGATTCT 1050
          ||||| |||||||| || || |||
Db    771 GCTGTNATGTCATCTGCCGACTCT 794



RESULT 15
BW274870
LOCUS       BW274870 549 bp mRNA linear EST 11-NOV-2002
DEFINITION  BW274870 Nori Satoh unpublished cDNA library, gastrula and neurula
            Ciona intestinalis cDNA clone cign070c12 5', mRNA sequence.
ACCESSION   BW274870
VERSION     BW274870.1 GI:24855481
KEYWORDS    EST.
SOURCE      Ciona intestinalis
  ORGANISM  Ciona intestinalis
            Eukaryota; Metazoa; Chordata; Urochordata; Ascidiacea; Enterogona;
            Phlebobranchia; Cionidae; Ciona.
REFERENCE   1 (bases 1 to 549)
  AUTHORS   Satou,Y., Shin-i,T., Kohara,Y. and Satoh,N.
  TITLE     Expressed genes in Ciona intestinalis (2002c)
  JOURNAL   Unpublished (2002)
COMMENT     Contact: Nori Satoh
            Department of Zoology
            Kyoto University
            Sakyo-ku, Kyoto, Kyoto 606-8502, Japan
            Tel: 81-75-753-4081
            Fax: 81-75-705-1113
            Email: satoh@ascidian.zool.kyoto-u.ac.jp.
FEATURES             Location/Qualifiers
     source          1..549
                     /organism="Ciona intestinalis"
                     /mol_type="mRNA"
                     /db_xref="taxon:7719"
                     /clone="cign070c12"
                     /tissue_type="whole body"
                     /dev_stage="gastrula and neurula"
                     /clone_lib="Nori Satoh unpublished cDNA library, gastrula
                     and neurula"
ORIGIN



Query Match 13.0%; Score 225.8; DB 13; Length 549;
Best Local Similarity 64.4%; Pred. No. 3.4e-50;
Matches 354; Conservative 0; Mismatches 193; Indels 3; Gaps 1;

Qy     11 ATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTGGTTGGAATAT 70
          ||||    ||  |  |  |||| ||||| ||||||| ||   ||| |   | | || || |
Db      2 ATGTTCCTGGTTTAGTGNCTATTATCGTCTTCTACGTTGCTATTCTAGCGATCGGTATTT 61

Qy     71 GGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAAGCCATCATAG 130
            || || ||||| |  | ||  |  || || | | | |||   |||||  | ||||| |
Db     62 ATGCAGCATGGAGGAAAAGAAGAACCGGAAGAGGAAACGAG---AGCGAGACAATCATGG 118

Qy    131 TTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCTACCTGGGTCG 190
          | || ||  |||| || ||  | || |||||  | ||||| ||||| ||||| ||||| |
Db    119 TCGGGGGAAGAGACATCGGACTCTTTGTTGGAAGCTTTACTATGACTGCTACGTGGGTAG 178

Qy    191 GAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTATGGCCTAGCTT 250
          | || || || ||||| |||||||| ||||  || ||    || ||||  || |||   |
Db    179 GTGGTGGTTACATCAACGGCACAGCAGAAGTTGTATACACCCCGGGTTCCGGTCTACTGT 238

Qy    251 GGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTGTTCTTTGCAA 310
          || | || || ||| |||| ||       || || ||  | || ||  |||| || || |
Db    239 GGACACAAGCGCCATTTGGTTACGGCTGCAGCCTCATGCTTGGCGGGTTGTTTTTCGCTA 298

Qy    311 AACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAAATCTATGGAA 370
          |    |||||  |  |||| || || |||||| | || || || || |  |    ||| |
Db    299 AGAAAATGCGGACTCAGGGTTACGTCACCATGCTGGATCCATTGCAACGTAAGCTTGGCA 358

Qy    371 AACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTCTGGGCTGCAG 430
            |||||||| || ||  |||   | || |||||  |||| ||||| |||||| | || |
Db    359 GGCGCATGGGGGGTCTGTTGTACTTACCAGCACTCTTGGGTGAAATATTCTGGTCAGCCG 418

Qy    431 CAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGATATGCACATTT 490
          | ||  |  | ||| | || |  ||  |   ||||||||| ||  | ||||| |  || |
Db    419 CCATCCTTGCCGCTCTTGGCGGTACATTGTCCGTGATCATAGACCTTGATATTCGTATCT 478

Qy    491 CTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGGCTCTATTCTG 550
          |||||||  | ||||||   |||||     |||| ||  |||| || || || ||||| |
Db    479 CTGTCATTGTATCTGCATGTATTGCTGTGTTGTATACGTTGGTTGGTGGTCTGTATTCGG 538

Qy    551 TGGCCTACAC 560
          |||| || ||
Db    539 TGGCTTATAC 548



Search completed: March 22, 2004, 15:16:56
Job time : 4765 secs

GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd.

OM nucleic - nucleic search, using sw model

Run on: March 22, 2004, 09:56:13 ; Search time 6973 Seconds
(without alignments)
10834.205 Million cell updates/sec

Title: US-10-069-541-5
Perfect score: 1743
Sequence: 1 atggctttccatgtggaagg ctgaagataatttacagtga 1743

Scoring table: IDENTITY_NUC
Gapop 10.0 , Gapext 1.0

Searched: 3470272 seqs, 21671516995 residues

Total number of hits satisfying chosen parameters: 6940544

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : GenEmbl:*
 1: gb_ba:*
 2: gb_htg:*
 3: gb_in:*
 4: gb_om:*
 5: gb_ov:*
 6: gb_pat:*
 7: gb_ph:*
 8: gb_pl:*
 9: gb_pr:*
10: gb_ro:*
11: gb_sts:*
12: gb_sy:*
13: gb_un:*
14: gb_vi:*
15: em_ba:*
16: em_fun:*
17: em_hum:*
18: em_in:*
19: em_mu:*
20: em_om:*
21: em_or:*
22: em_ov:*
23: em_pat:*
24: em_ph:*
25: em_pl:*
26: em_ro:*
27: em_sts:*
28: em_un:*
29: em_vi:*
30: em_htg_hum:*
31: em_htg_inv:*
32: em_htg_other:*
33: em_htg_mus:*
34: em_htg_pln:*
35: em_htg_rod:*
36: em_htg_mam:*
37: em_htg_vrt:*
38: em_sy:*
39: em_htgo_hum:*
40: em_htgo_mus:*
41: em_htgo_other:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES

                  %
Result          Query
  No.    Score  Match  Length  DB  ID         Description
    1     1743  100.0    1743   6  E49871     E49871 High-affini
    2     1743  100.0    1743   6  BD012719   BD012719 High-affi
    3     1743  100.0    1743   9  AF276871   AF276871 Homo sapi
    4     1743  100.0    1813   9  HSA401466  AJ401466 Homo sapi
    5     1743  100.0  [illegible]   9  AB043997   AB043997 Homo sapi


    6   1738.2   99.7    1743   6  AR268949   AR268949 Sequence
    7   1394.2   80.0    1743   6  E49870     E49870 High-affini
    8   1394.2   80.0    1743   6  BD012718   BD012718 High-affi
    9   1394.2   80.0    4904  10  AB030947   AB030947 Rattus no
   10     1375   78.9    1743  10  AF276872   AF276872 Mus muscu
   11   1373.4   78.8    1743   6  E49872     E49872 High-affini
   12   1373.4   78.8    1743   6  BD012720   BD012720 High-affi
   13   1373.4   78.8    4938   6  AX080443   AX080443 Sequence
   14     1367   78.4    1743  10  MMU401467  AJ401467 Mus muscu
   15      867   49.7    2528   5  TMA420808  AJ420808 Torpedo m
   16      730   41.9    1132   5  GGA511267  AJ511267 Gallus ga
   17    630.8   36.2    2239   9  HSA308384  AJ308384 Homo sapi
   18    630.8   36.2  190043   9  AC009963   AC009963 Homo sapi
   19    502.8   28.8  232792   2  AC106657   AC106657 Rattus no
   20    501.2   28.8  155131   2  AC102873   AC102873 Mus muscu
   21    431.4   24.8    3326   3  AY011119   AY011119 Limulus p
   22    405.8   23.3    3255   3  AY047521   AY047521 Drosophil
   23    363.8   20.9    1731   6  E49869     E49869 High-affini
   24    363.8   20.9    1731   6  BD012717   BD012717 High-affi
   25    363.8   20.9    1985   3  AB030946   AB030946 Caenorhab
   26    279.6   16.0     386   6  AX080449   AX080449 Sequence
   27    242.6   13.9    1461   6  AX432086   AX432086 Sequence
   28      226   13.0    1657   9  HSA308383  AJ308383 Homo sapi
   29    179.6   10.3    1178   9  HSA308378  AJ308378 Homo sapi
   30    179.6   10.3  186989   3  AC007812   AC007812 Drosophil
   31    179.6   10.3  189117   3  AC009395   AC009395 Drosophil
   32    179.6   10.3  255620   3  AE003723   AE003723 Drosophil
   33    167.6    9.6  140156   2  AC017381   AC017381 Drosophil
   34      163    9.4    2326   9  HSA308379  AJ308379 Homo sapi
   35      155    8.9    1467   9  HSA308382  AJ308382 Homo sapi
c  36    151.8    8.7   40893   3  CBRG45E19  AC084631 Caenorhab
   37    150.2    8.6     736   9  HSA308381  AJ308381 Homo sapi
   38      150    8.6    1308   9  HSA308380  AJ308380 Homo sapi
c  39    141.6    8.1   39908   3  CEC48D1    Z81049 Caenorhabdi
c  40    141.6    8.1  330724   2  CEY67H2    AL022475 Caenorhab
c  41      132    7.6  152021   2  AC010923   AC010923 Drosophil
   42       98    5.6     616  11  G84799     G84799 S208P6036FB
c  43     51.8    3.0   10732   6  E32986     E32986 Gene encodi
c  44       49    2.8    2000   6  AX655393   AX655393 Sequence
   45     48.2    2.8    2781   1  BSU92466   U92466 Bacillus su



ALIGNMENTS

RESULT 1
E49871
LOCUS       E49871 1743 bp DNA linear PAT 27-AUG-2002
DEFINITION  High-affinity choline transporter.
ACCESSION   E49871
VERSION     E49871.1 GI:22554902
KEYWORDS    JP 2001136976-A/3.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 1743)
  AUTHORS   Haga,T. and Okuda,T.
  TITLE     High-affinity choline transporter
  JOURNAL   Patent: JP 2001136976-A 3 22-MAY-2001;
            SCIENCE & TECH AGENCY
COMMENT     OS Homo sapiens (human)
            PN JP 2001136976-A/3
            PD 22-MAY-2001
            PF 27-DEC-1999 JP 1999368991
            PI TATSUYA HAGA,TAKASHI OKUDA
            PC C12N15/09,A01K67/027,A61K38/00,C07K14/47,C07K16/18,C07K19/00,
            PC C12N5/10,
            PC C12P21/02,C12P21/08,C12Q1/00,C12N15/00,A61K37/02,C12N5/00
            CC
            FH Key Location/Qualifiers
            FT CDS (1)..(1743).
FEATURES             Location/Qualifiers
     source          1..1743
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
ORIGIN



Query Match 100.0%; Score 1743; DB 6; Length 1743;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 1743; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy      1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

Qy     61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

Qy    121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

Qy    181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

Qy    241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300

Qy    301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

Qy    361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

Qy    421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

Qy    481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

Qy    541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600

Qy    601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660

Qy    661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

Qy    721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780

Qy    781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

Qy    841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900

Qy    901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

Qy    961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020

Qy   1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080

Qy   1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140

Qy   1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

Qy   1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260

Qy   1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320

Qy   1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380

Qy   1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

Qy   1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

Qy   1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560

Qy   1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

Qy   1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

Qy   1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

Qy   1741 TGA 1743
          |||
Db   1741 TGA 1743



RESULT 2
BD012719
LOCUS       BD012719 1743 bp DNA linear PAT 02-AUG-2002
DEFINITION  High-affinity choline transporter.
ACCESSION   BD012719
VERSION     BD012719.1 GI:22092908
KEYWORDS    WO 0116315-A/3.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 1743)
  AUTHORS   Haga,T. and Okuda,T.
  TITLE     High-affinity choline transporter
  JOURNAL   Patent: WO 0116315-A 3 08-MAR-2001;
            JAPAN SCIENCE AND TECHNOLOGY CORP,TATSUYA HAGA,TAKASHI OKUDA
COMMENT     OS Homo sapiens (human)
            PN WO 0116315-A/3
            PD 08-MAR-2001
            PF 18-AUG-2000 WO 2000JP005545
            PR 27-AUG-1999 JP 99P 240642,27-DEC-1999 JP 99P 368991
            PI TATSUYA HAGA,TAKASHI OKUDA
            PC C12N15/12,C07K14/47,C12Q1/68,C07K19/00,C07K16/18,C12N5/10,
            PC A61K38/17,
            PC A61K45/00,A61P25/28,G01N33/53,A01K67/027
            CC
            FH Key Location/Qualifiers
            FT CDS (1)..(1743).
FEATURES             Location/Qualifiers
     source          1..1743
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
ORIGIN



Query Match 100.0%; Score 1743; DB 6; Length 1743;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 1743; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy      1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

Qy     61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

Qy    121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

Qy    181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

Qy    241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300

Qy    301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

Qy    361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

Qy    421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

Qy    481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

Qy    541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600

Qy    601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660

Qy    661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

Qy    721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780

Qy    781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

Qy    841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900

Qy    901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

Qy 961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020 

I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I 
Db 961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020 

Qy 1021 TCTGCTGCTGT TAT GT CAT CAG C AGAT T CT T C CAT CTT GT C AGC AAGT T C CAT GT T T G C A 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 
Db 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080 



Qy 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140

Qy 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260

Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320

Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380

Qy 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

Qy 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

Qy 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560

Qy 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

Qy 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

Qy 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

Qy 1741
Db 1741



I I 



RESULT 3
AF276871
LOCUS       AF276871 1743 bp mRNA linear PRI 27-NOV-2000
DEFINITION  Homo sapiens high affinity choline transporter (SLC5A7) mRNA,
            complete cds.
ACCESSION   AF276871
VERSION     AF276871.1 GI:10998441

KEYWORDS
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 1743)
  AUTHORS   Apparsundaram,S., Ferguson,S.M., George,A.L. Jr. and Blakely,R.D.
  TITLE     Molecular cloning of a human, hemicholinium-3-sensitive choline
            transporter
  JOURNAL   Biochem. Biophys. Res. Commun. 276 (3), 862-867 (2000)
  MEDLINE   20483599
  PUBMED    11027560
REFERENCE   2 (bases 1 to 1743)
  AUTHORS   Apparsundaram,S., Ferguson,S.M. and Blakely,R.D.
  TITLE     Direct Submission
  JOURNAL   Submitted (09-JUN-2000) Department of Pharmacology and Center for
            Molecular Neuroscience, Vanderbilt University, 23rd Avenue South at
            Pierce, Nashville, TN 37232-6420, USA
FEATURES             Location/Qualifiers
     source          1..1743
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /chromosome="2"
                     /map="2q12"
     gene            1..1743
                     /gene="SLC5A7"
     CDS             1..1743
                     /gene="SLC5A7"
                     /note="hCHT; solute carrier family 5 member 7"
                     /codon_start=1
                     /product="high affinity choline transporter"
                     /protein_id="AAG25940.1"
                     /db_xref="GI:10998442"
                     /translation="MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIV
                     GGRDIGLLVGGFTMTATWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFF
                     AKPMRSKGYVTMLDPFQQIYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVD
                     MHISVIISALIATLYTLVGGLYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFT
                     AVHAKYQKPWLGTVDSSEVYSWLDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFL
                     AAFGCLVMAIPAILIGAIGASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISF
                     FGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVWVMRITVFVFGASA
                     TAMALLTKTVYGLWYLSSDLVYIVIFPQLLCVLFVKGTNTYGAVAGYVSGLFLRITGG
                     EPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLP
                     PKLDVFDAVVARHSEENMDKTILVKNENIKLDELALVKPRQSMTLSSTFTNKEAFLDV
                     DSSPEGSGTEDNLQ"

ORIGIN 



Query Match 100.0%; Score 1743; DB 9; Length 1743; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1743; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
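
The summary figures above can be cross-checked against the alignment rows. The following is a minimal sketch, not the search program's own code. It assumes that Query Match is the score divided by the perfect score (1743 for this query, as stated in the run header) and that Best Local Similarity is the number of matches divided by the number of aligned columns. The function names are hypothetical.

```python
def query_match_percent(score, perfect_score):
    """Score as a percentage of the perfect score (assumed definition)."""
    return round(100.0 * score / perfect_score, 1)


def alignment_counts(qy, db, gap="-"):
    """Count matches, mismatches and indel columns in two aligned rows."""
    if len(qy) != len(db):
        raise ValueError("aligned rows must have equal length")
    matches = mismatches = indels = 0
    for a, b in zip(qy, db):
        if a == gap or b == gap:
            indels += 1          # a gap character on either row
        elif a == b:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, indels


def local_similarity_percent(qy, db):
    """Matches as a percentage of aligned columns (assumed definition)."""
    matches, _, _ = alignment_counts(qy, db)
    return round(100.0 * matches / len(qy), 1)


# Figures reported for this result: Score 1743 against a perfect score of 1743.
print(query_match_percent(1743, 1743))
# Final row of the alignment (Qy 1741 to 1743 against Db 1741 to 1743).
print(alignment_counts("TGA", "TGA"))
```

Applied row by row and summed, the counts should add up to the reported totals (1743 matches, no mismatches or indels) whenever the assumptions above hold.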

Qy    1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

Qy   61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

Qy  121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

Qy  181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

Qy  241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300

Qy  301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

Qy  361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

Qy  421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

Qy  481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

Qy  541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600

Qy  601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660

Qy  661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

Qy  721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780

Qy  781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

Qy  841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900

Qy  901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960



Qy  961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020

Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080

Qy 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140

Qy 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260

Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320

Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380

Qy 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

Qy 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

Qy 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560

Qy 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

Qy 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

Qy 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

Qy 1741 TGA 1743
        |||
Db 1741 TGA 1743

LOCUS       HSA401466 1813 bp mRNA linear PRI 16-AUG-2000
DEFINITION  Homo sapiens mRNA for high affinity choline transporter (CHT1
            gene).
ACCESSION   AJ401466
VERSION     AJ401466.1 GI:9843753
KEYWORDS    ChT1 gene; high affinity choline transporter.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1
  AUTHORS   Wieland,A., Bonisch,H. and Bruss,M.
  TITLE     Molecular cloning of the human and murine high affinity choline
            transporters and characterization of the human gene-structure
  JOURNAL   Unpublished
REFERENCE   2 (bases 1 to 1813)
  AUTHORS   Bruess,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (14-AUG-2000) Bruess M., University of Bonn, Pharmacology
            and Toxicology, Reuterstr. 2b, D-53113 Bonn, GERMANY
FEATURES             Location/Qualifiers
     source          1..1813
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /chromosome="2"
                     /map="2q11-13"
                     /tissue_type="hypothalamus"
     gene            1..1813
                     /gene="CHT1"
     CDS             19..1761
                     /gene="CHT1"
                     /function="sodium- and chloride-dependent reuptake of
                     choline"
                     /codon_start=1
                     /evidence=experimental
                     /product="high affinity choline transporter"
                     /protein_id="CAC03717.1"
                     /db_xref="GI:9843754"
                     /db_xref="GOA:Q9GZV3"
                     /db_xref="SPTREMBL:Q9GZV3"
                     /translation="MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIV
                     GGRDIGLLVGGFTMTATWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFF
                     AKPMRSKGYVTMLDPFQQIYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVD
                     MHISVIISALIATLYTLVGGLYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFT
                     AVHAKYQKPWLGTVDSSEVYSWLDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFL
                     AAFGCLVMAIPAILIGAIGASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISF
                     FGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVWVMRITVFVFGASA
                     TAMALLTKTVYGLWYLSSDLVYIVIFPQLLCVLFVKGTNTYGAVAGYVSGLFLRITGG
                     EPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLP
                     PKLDVFDAVVARHSEENMDKTILVKNENIKLDELALVKPRQSMTLSSTFTNKEAFLDV
                     DSSPEGSGTEDNLQ"
ORIGIN



Query Match 100.0%; Score 1743; DB 9; Length 1813; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1743; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
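
Because this hit reports no indels or gaps, query and database coordinates differ by a constant offset: the alignment that follows runs from Db 19 to Db 1761, which is the CDS location annotated in the record. A small sketch of that mapping is given here. The helper name is hypothetical, and the shortcut holds only for ungapped alignments.

```python
def query_to_db(qy_pos, qy_start=1, db_start=19):
    """Map a query position to its database position in an ungapped alignment.

    qy_start and db_start are the first aligned positions of the two rows;
    the defaults are the values shown for this hit (Qy 1 against Db 19).
    """
    if qy_pos < qy_start:
        raise ValueError("position precedes the aligned region")
    return db_start + (qy_pos - qy_start)


print(query_to_db(1))      # first aligned base
print(query_to_db(61))     # start of the second alignment row
print(query_to_db(1743))   # last base of the query
```

A gapped alignment would need a column-by-column walk instead, since every indel shifts the offset.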

Qy    1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   19 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 78

Qy   61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   79 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 138

Qy  121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  139 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 198

Qy  181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  199 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 258

Qy  241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  259 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 318

Qy  301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  319 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 378

Qy  361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  379 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 438

Qy  421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  439 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 498

Qy  481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  499 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 558

Qy  541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  559 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 618

Qy  601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  619 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 678

Qy  661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  679 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 738

Qy  721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  739 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 798

Qy  781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  799 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 858

Qy  841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  859 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 918

Qy  901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  919 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 978

Qy  961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  979 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1038

Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1039 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1098

Qy 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1099 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1158

Qy 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1159 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1218

Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1219 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1278

Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1279 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1338

Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1339 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1398

Qy 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1399 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1458

Qy 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1459 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1518

Qy 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1519 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1578

Qy 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1579 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1638

Qy 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1639 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1698

Qy 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1699 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1758

Qy 1741 TGA 1743
        |||
Db 1759 TGA 1761

LOCUS       AB043997 5158 bp mRNA linear PRI 19-NOV-2000
DEFINITION  Homo sapiens mRNA for high-affinity choline transporter CHT1,
            complete cds.
ACCESSION   AB043997
VERSION     AB043997.1 GI:11231080
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (sites)
  AUTHORS   Okuda,T. and Haga,T.
  TITLE     Functional characterization of the human high-affinity choline
            transporter
  JOURNAL   FEBS Lett. 484 (2), 92-97 (2000)
  MEDLINE   20521663
  PUBMED    11068039
REFERENCE   2 (bases 1 to 5158)
  AUTHORS   Okuda,T.
  TITLE     Direct Submission
  JOURNAL   Submitted (30-MAY-2000) Takashi Okuda, University of Tokyo, Faculty
            of Medicine, Department of Neurochemistry; 7-3-1 Hongo, Bunkyo-ku,
            Tokyo 1130033, Japan (E-mail:okuda@m.u-tokyo.ac.jp,
            URL:http://park.ecc.u-tokyo.ac.jp/neurochemistry,
            Tel:81-3-5841-3560, Fax:81-3-6814-8154)
FEATURES             Location/Qualifiers
     source          1..5158
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /tissue_type="spinal cord"
     CDS             277..2019
                     /codon_start=1
                     /product="high-affinity choline transporter CHT1"
                     /protein_id="BAB18161.1"
                     /db_xref="GI:11231081"
                     /translation="MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIV
                     GGRDIGLLVGGFTMTATWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFF
                     AKPMRSKGYVTMLDPFQQIYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVD
                     MHISVIISALIATLYTLVGGLYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFT
                     AVHAKYQKPWLGTVDSSEVYSWLDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFL
                     AAFGCLVMAIPAILIGAIGASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISF
                     FGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVWVMRITVFVFGASA
                     TAMALLTKTVYGLWYLSSDLVYIVIFPQLLCVLFVKGTNTYGAVAGYVSGLFLRITGG
                     EPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLP
                     PKLDVFDAVVARHSEENMDKTILVKNENIKLDELALVKPRQSMTLSSTFTNKEAFLDV
                     DSSPEGSGTEDNLQ"
ORIGIN

Query Match 100.0%; Score 1743; DB 9; Length 5158; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1743; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy 


1 


ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 


60 




1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


277 


ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 


336 


Qy 


61 


GT T GGAAT AT GG GCT GCCT GGAGAAC CAAAAAC AGT GGC AGC GCAGAAGAGCGCAGC GAA 


120 




I I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 II 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


337 


GTT GGAAT AT GGGCT GCCT G GAGAAC CAAAAACAGT GGC AGC GCAGAAGAGC G CAG C GAA 


396 


Qy 


121 


GC CAT CAT AGTT G GT GGC C GAGAT AT T GGT TT ATT GGTT GGT GGAT T TAC CAT GAC AGCT 


180 




              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        397 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 456

Qy        181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        457 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 516

Qy        241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        517 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 576

Qy        301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        577 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 636

Qy        361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        637 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 696

Qy        421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        697 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 756

Qy        481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        757 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 816

Qy        541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        817 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 876

Qy        601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        877 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 936

Qy        661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        937 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 996

Qy        721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        997 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 1056

Qy        781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1057 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 1116

Qy        841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1117 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 1176

Qy        901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1177 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 1236

Qy        961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1237 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1296

Qy       1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1297 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1356

Qy       1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1357 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1416

Qy       1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1417 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1476

Qy       1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1477 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1536

Qy       1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1537 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1596

Qy       1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1597 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1656

Qy       1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1657 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1716

Qy       1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1717 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1776

Qy       1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1777 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1836

Qy       1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1837 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1896

Qy       1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1897 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1956

Qy       1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1957 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 2016

Qy       1741 TGA 1743
              |||
Db       2017 TGA 2019
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ORIGIN 



LOCUS       AR268949   1743 bp   DNA   linear   PAT 10-APR-2003
DEFINITION  Sequence 1 from patent US 6500643.
ACCESSION   AR268949
VERSION     AR268949.1  GI:29699686
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1 (bases 1 to 1743)
  AUTHORS   Wu,D.-H., Gu,Y., Millard,W.J. and He,Y.-J.
  TITLE     Human high affinity choline transporter
  JOURNAL   Patent: US 6500643-A 1 31-DEC-2002;
FEATURES             Location/Qualifiers
     source          1..1743
                     /organism="unknown"
                     /mol_type="genomic DNA"

  Query Match             99.7%;  Score 1738.2;  DB 6;  Length 1743;
  Best Local Similarity   99.8%;  Pred. No. 0;
  Matches 1740;  Conservative 0;  Mismatches 3;  Indels 0;  Gaps 0;



Qy          1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

Qy         61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

Qy        121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

Qy        181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

Qy        241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300

Qy        301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

Qy        361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

Qy        421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

Qy        481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

Qy        541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600

Qy        601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660

Qy        661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

Qy        721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780

Qy        781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

Qy        841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
              ||||||||||||||||||||||||||||||||||||||||||||||| || |||||||||
Db        841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCCTCCACAGACTGG 900

Qy        901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

Qy        961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020

Qy       1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080

Qy       1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140

Qy       1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

Qy       1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260

Qy       1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320

Qy       1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380

Qy       1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

Qy       1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

Qy       1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560

Qy       1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

Qy       1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

Qy       1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAA 1740

Qy       1741 TGA 1743
              |||
Db       1741 TGA 1743
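
The summary lines reported for the AR268949 hit (Score 1738.2, Matches 1740, Mismatches 3, Length 1743) can be cross-checked with a little arithmetic. The Python sketch below is an illustration and not part of the search output. It assumes that Query Match is the score divided by the perfect score (1743, from the report header) and that Best Local Similarity is matches divided by aligned columns. The mismatch penalty of 0.6 is an inference from the reported numbers (1740 - 0.6 x 3 = 1738.2), not a value stated in the report, and the function name `summary_stats` is hypothetical. It only applies to gap-free alignments such as this one (Indels 0).

```python
def summary_stats(matches, mismatches, perfect_score, mismatch_penalty=0.6):
    """Recompute score, Query Match % and Best Local Similarity % for a gap-free hit.

    mismatch_penalty=0.6 is inferred from the reported scores, not documented.
    """
    aligned_columns = matches + mismatches
    score = matches - mismatch_penalty * mismatches
    query_match = 100.0 * score / perfect_score
    best_local_similarity = 100.0 * matches / aligned_columns
    return round(score, 1), round(query_match, 1), round(best_local_similarity, 1)


# Values copied from the summary lines of the AR268949 hit.
score, query_match, similarity = summary_stats(
    matches=1740, mismatches=3, perfect_score=1743
)
print(score, query_match, similarity)
```

Under these assumptions the recomputed values agree with the reported 1738.2, 99.7% and 99.8%.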



RESULT 7
E49870
LOCUS       E49870   1743 bp   DNA   linear   PAT 27-AUG-2002
DEFINITION  High-affinity choline transporter.
ACCESSION   E49870
VERSION     E49870.1  GI:22554901
KEYWORDS    JP 2001136976-A/2.
SOURCE      Rattus sp.
  ORGANISM  Rattus sp.
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1 (bases 1 to 1743)
  AUTHORS   Haga,T. and Okuda,T.
  TITLE     High-affinity choline transporter
  JOURNAL   Patent: JP 2001136976-A 2 22-MAY-2001;
            SCIENCE & TECH AGENCY
COMMENT     OS   Rattus sp. (rat)
            PN   JP 2001136976-A/2
            PD   22-MAY-2001
            PF   27-DEC-1999 JP 1999368991
            PI   TATSUYA HAGA, TAKASHI OKUDA
            PC   C12N15/09,A01K67/027,A61K38/00,C07K14/47,C07K16/18,C07K19/00,
            PC   C12N5/10,
            PC   C12P21/02,C12P21/08,C12Q1/00,C12N15/00,A61K37/02,C12N5/00
            CC
            FH   Key             Location/Qualifiers
            FT   CDS             (1)..(1743).
FEATURES             Location/Qualifiers
     source          1..1743
                     /organism="Rattus sp."
                     /mol_type="genomic DNA"
                     /db_xref="taxon:10118"
ORIGIN

  Query Match             80.0%;  Score 1394.2;  DB 6;  Length 1743;
  Best Local Similarity   87.5%;  Pred. No. 0;
  Matches 1525;  Conservative 0;  Mismatches 218;  Indels 0;  Gaps 0;

Qy          1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
              ||| |||||||||| ||||||||  |||| || ||| ||||||||||||| || || |||
Db          1 ATGCCTTTCCATGTAGAAGGACTAGTAGCGATTATCCTGTTCTACCTTCTTATATTTCTG 60

Qy         61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
              ||||||||||||||||| |||| |||||||||||| || |  |||||||| |||||||||
Db         61 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGTAATGCAGAAGAACGCAGCGAA 120

Qy        121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
              |||||||||||||| |||||||| |||||||| ||||||||||| ||||||||||||||
Db        121 GCCATCATAGTTGGGGGCCGAGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 180

Qy        181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
              |||||||| |||||||| || ||||| || |||||||||||||||||||  ||||||| |
Db        181 ACCTGGGTTGGAGGAGGTTACATCAACGGGACAGCTGAAGCAGTTTATGGGCCAGGTTGT 240

Qy        241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
              || |||||||||||||||||||| |||||||||||||| |||||||||||||||||||||
Db        241 GGTCTAGCTTGGGCTCAGGCACCCATTGGATATTCTCTGAGTCTGATTTTAGGTGGCCTG 300

Qy        301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
              || |||||||||||||||||||| ||||| |||||||| ||||||||||||||||| ||
Db        301 TTTTTTGCAAAACCTATGCGTTCCAAGGGATATGTGACTATGTTAGACCCGTTTCAACAG 360

Qy        361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
              ||||||||||| |||||||| || || ||||| || ||||||||||||||||| ||||||
Db        361 ATCTATGGAAAGCGCATGGGTGGGCTGCTGTTCATCCCTGCACTGATGGGAGAGATGTTC 420

Qy        421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
              ||||||||||||||||||||||| || || || ||||||||||| ||||| |||||||||
Db        421 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCTACCATCAGCGTAATCATTGATGTGGAT 480


Qy        481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
               || |||| || |||||  |||| ||||||||||||| ||| || || || |||||||||
Db        481 GTGAACATATCGGTCATTGTCTCCGCACTCATTGCCATTCTTTATACCCTCGTGGGAGGG 540

Qy        541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
              ||||| |||||||| || |||||||| || ||||| || ||||||||| ||||  |||||
Db        541 CTCTACTCTGTGGCATATACTGATGTTGTACAGCTATTCTGCATTTTTATAGGATTGTGG 600

Qy        601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
              ||||| ||||| |||||  ||||||||||||||||| | ||||| || ||||||||||||
Db        601 ATCAGTGTCCCATTTGCCCTGTCACATCCTGCAGTCACCGACATTGGATTCACTGCTGTG 660

Qy        661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
              ||||| |||||||| |  || |||||||||||  |||| |||  |||||||||| | |||
Db        661 CATGCTAAATACCAGAGTCCCTGGCTGGGAACCATTGAATCAGTTGAAGTCTACACCTGG 720

Qy        721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
              ||||||| ||||||||||||||||||||||||||| ||||||||||| ||||| ||||||
Db        721 CTTGATAATTTTCTGTTGTTGATGCTGGGTGGAATACCATGGCAAGCCTACTTCCAGAGG 780

Qy        781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
              || |||||||| || ||||| ||||||||||| ||||||||||||||||||||||| |||
Db        781 GTCCTCTCTTCATCGTCAGCGACCTATGCTCAGGTGCTGTCCTTCCTGGCAGCTTTTGGG 840

Qy        841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
              ||||||||||||||  | ||||||||   |||||||||||||||||| || |||||||||
Db        841 TGCCTGGTGATGGCTCTACCAGCCATTTGCATTGGGGCCATTGGAGCCTCCACAGACTGG 900

Qy        901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
              ||||| |||||||||||| |||||||||||||||| |  || || |||||||||||| |
Db        901 AACCAAACTGCATATGGGTTTCCAGATCCCAAGACCAAGGAGGAAGCAGACATGATTCTC 960

Qy        961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
              || |||||||| ||||| |||||||||||||| ||||| |||||||| |||||||| |||
Db        961 CCGATTGTTCTACAGTACCTCTGCCCTGTGTACATTTCCTTCTTTGGGCTTGGTGCTGTT 1020

Qy       1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
              ||||||||||| ||||| || || || || |||||| | ||||||||||||||||||||
Db       1021 TCTGCTGCTGTCATGTCCTCGGCTGACTCATCCATCCTATCAGCAAGTTCCATGTTTGCT 1080

Qy       1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
              ||||| ||||||||||||||||||||||||||||| || ||||| ||||| || |||||
Db       1081 CGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAAATTGTGTGGGTC 1140

Qy       1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
              ||| | ||||| ||||||||||||||||||||||||||||||||||||||||| |||||
Db       1141 ATGAGGATCACTGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTCACGAAG 1200

Qy       1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
              ||||||||||||||||||||||| || ||||||||||| |||||| | |||||||| |||
Db       1201 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1260

Qy       1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
              ||||| |||||||||||  | || ||||||||||| |||||||| || || |||||| ||
Db       1261 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1320

Qy       1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
              | ||| || ||||||||||| || ||||| ||||||||||| ||  | |||||||| |||
Db       1321 TTTGGACTTTTCCTGAGAATTACCGGAGGAGAGCCATATCTATACTTGCAGCCCTTAATC 1380

Qy       1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
              ||||||||||| |||||||||||  | ||||||||||| |||||||  || |||||||||
Db       1381 TTCTACCCTGGTTATTACCCTGACAAGAATGGTATATACAATCAGAGGTTCCCATTTAAA 1440

Qy       1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
              || ||  |||||||||| |||||||| |||||||||||  | ||||||||||||||||||
Db       1441 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCCTATCTAGCCAAGTAT 1500

Qy       1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
              ||||||||||||||||||||||| || ||||||||| ||||||||||||||||  ||||
Db       1501 CTATTTGAAAGTGGAACCTTGCCTCCAAAATTAGATATATTTGATGCTGTTGTCTCAAGG 1560

Qy       1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
              ||||||||||| |||||||| ||||| ||||| |||| ||||||||| || |||||| ||
Db       1561 CACAGTGAAGAGAACATGGACAAGACCATTCTAGTCAGAAATGAAAACATCAAATTAAAT 1620

Qy       1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
              |||||||||| ||| ||||| ||||||||| | |||||||| ||||||||||||||||||
Db       1621 GAACTTGCACCTGTAAAGCCTCGACAGAGCCTAACCCTCAGTTCAACTTTCACCAATAAA 1680

Qy       1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
              |||||  ||||||||||||||||||||||||| || ||||||||||||||||| |||||
Db       1681 GAGGCTCTCCTTGATGTTGATTCCAGTCCAGAGGGATCTGGGACTGAAGATAACTTACAA 1740

Qy       1741 TGA 1743
              |||
Db       1741 TGA 1743
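
The middle row of each Qy/Db block marks the columns where the two sequences agree. As a minimal sketch (not the search program's own code), the Python below rebuilds that row for a gap-free pair of aligned rows and counts matches and mismatches, using the first 60 columns of the E49870 alignment as input. The function name `match_line` is hypothetical, and the `|` marker is an assumption about the original output character.

```python
def match_line(query_row: str, db_row: str, mark: str = "|") -> str:
    """Return the match row for two aligned, gap-free sequence rows."""
    if len(query_row) != len(db_row):
        raise ValueError("aligned rows must have equal length")
    # A marker where the bases agree, a blank where they differ.
    return "".join(mark if q == d else " " for q, d in zip(query_row, db_row))


# Columns 1-60 of the query (Qy) and of E49870 (Db).
qy = "ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG"
db = "ATGCCTTTCCATGTAGAAGGACTAGTAGCGATTATCCTGTTCTACCTTCTTATATTTCTG"

line = match_line(qy, db)
matches = line.count("|")
mismatches = len(line) - matches
print(line)
print(matches, mismatches)
```

Summing the per-block mismatch counts over all rows of a hit should reproduce the Mismatches figure in its summary lines.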



RESULT 8
BD012718
LOCUS       BD012718   1743 bp   DNA   linear   PAT 02-AUG-2002
DEFINITION  High-affinity choline transporter.
ACCESSION   BD012718
VERSION     BD012718.1  GI:22092907
KEYWORDS    WO 0116315-A/2.
SOURCE      Rattus norvegicus (Norway rat)
  ORGANISM  Rattus norvegicus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1 (bases 1 to 1743)
  AUTHORS   Haga,T. and Okuda,T.
  TITLE     High-affinity choline transporter
  JOURNAL   Patent: WO 0116315-A 2 08-MAR-2001;
            JAPAN SCIENCE AND TECHNOLOGY CORP, TATSUYA HAGA, TAKASHI OKUDA
COMMENT     OS   Rattus norvegicus (rat)
            PN   WO 0116315-A/2
            PD   08-MAR-2001
            PF   18-AUG-2000 WO 2000JP005545
            PR   27-AUG-1999 JP 99P 240642, 27-DEC-1999 JP 99P 368991
            PI   TATSUYA HAGA, TAKASHI OKUDA
            PC   C12N15/12,C07K14/47,C12Q1/68,C07K19/00,C07K16/18,C12N5/10,
            PC   A61K38/17,
            PC   A61K45/00,A61P25/28,G01N33/53,A01K67/027
            CC
            FH   Key             Location/Qualifiers
            FT   CDS             (1)..(1743).
FEATURES             Location/Qualifiers
     source          1..1743
                     /organism="Rattus norvegicus"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:10116"
ORIGIN

  Query Match             80.0%;  Score 1394.2;  DB 6;  Length 1743;
  Best Local Similarity   87.5%;  Pred. No. 0;
  Matches 1525;  Conservative 0;  Mismatches 218;  Indels 0;  Gaps 0;

Qy          1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
              ||| |||||||||| ||||||||  |||| || ||| ||||||||||||| || || |||
Db          1 ATGCCTTTCCATGTAGAAGGACTAGTAGCGATTATCCTGTTCTACCTTCTTATATTTCTG 60

Qy         61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
              ||||||||||||||||| |||| |||||||||||| || |  |||||||| |||||||||
Db         61 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGTAATGCAGAAGAACGCAGCGAA 120

Qy        121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
              |||||||||||||| |||||||| |||||||| ||||||||||| ||||||||||||||
Db        121 GCCATCATAGTTGGGGGCCGAGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 180

Qy        181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
              |||||||| |||||||| || ||||| || |||||||||||||||||||  ||||||| |
Db        181 ACCTGGGTTGGAGGAGGTTACATCAACGGGACAGCTGAAGCAGTTTATGGGCCAGGTTGT 240

Qy        241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
              || |||||||||||||||||||| |||||||||||||| |||||||||||||||||||||
Db        241 GGTCTAGCTTGGGCTCAGGCACCCATTGGATATTCTCTGAGTCTGATTTTAGGTGGCCTG 300

Qy        301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
              || |||||||||||||||||||| ||||| |||||||| ||||||||||||||||| ||
Db        301 TTTTTTGCAAAACCTATGCGTTCCAAGGGATATGTGACTATGTTAGACCCGTTTCAACAG 360

Qy        361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
              ||||||||||| |||||||| || || ||||| || ||||||||||||||||| ||||||
Db        361 ATCTATGGAAAGCGCATGGGTGGGCTGCTGTTCATCCCTGCACTGATGGGAGAGATGTTC 420

Qy        421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
              ||||||||||||||||||||||| || || || ||||||||||| ||||| |||||||||
Db        421 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCTACCATCAGCGTAATCATTGATGTGGAT 480

Qy        481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
               || |||| || |||||  |||| ||||||||||||| ||| || || || |||||||||
Db        481 GTGAACATATCGGTCATTGTCTCCGCACTCATTGCCATTCTTTATACCCTCGTGGGAGGG 540

Qy        541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
              ||||| |||||||| || |||||||| || ||||| || ||||||||| ||||  |||||
Db        541 CTCTACTCTGTGGCATATACTGATGTTGTACAGCTATTCTGCATTTTTATAGGATTGTGG 600


Qy        601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
              ||||| ||||| |||||  ||||||||||||||||| | ||||| || ||||||||||||
Db        601 ATCAGTGTCCCATTTGCCCTGTCACATCCTGCAGTCACCGACATTGGATTCACTGCTGTG 660

Qy        661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
              ||||| |||||||| |  || |||||||||||  |||| |||  |||||||||| | |||
Db        661 CATGCTAAATACCAGAGTCCCTGGCTGGGAACCATTGAATCAGTTGAAGTCTACACCTGG 720

Qy        721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
              ||||||| ||||||||||||||||||||||||||| ||||||||||| ||||| ||||||
Db        721 CTTGATAATTTTCTGTTGTTGATGCTGGGTGGAATACCATGGCAAGCCTACTTCCAGAGG 780

Qy        781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
              || |||||||| || ||||| ||||||||||| ||||||||||||||||||||||| |||
Db        781 GTCCTCTCTTCATCGTCAGCGACCTATGCTCAGGTGCTGTCCTTCCTGGCAGCTTTTGGG 840

Qy        841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
              ||||||||||||||  | ||||||||   |||||||||||||||||| || |||||||||
Db        841 TGCCTGGTGATGGCTCTACCAGCCATTTGCATTGGGGCCATTGGAGCCTCCACAGACTGG 900

Qy        901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
              ||||| |||||||||||| |||||||||||||||| |  || || |||||||||||| |
Db        901 AACCAAACTGCATATGGGTTTCCAGATCCCAAGACCAAGGAGGAAGCAGACATGATTCTC 960

Qy        961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
              || |||||||| ||||| |||||||||||||| ||||| |||||||| |||||||| |||
Db        961 CCGATTGTTCTACAGTACCTCTGCCCTGTGTACATTTCCTTCTTTGGGCTTGGTGCTGTT 1020

Qy       1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
              ||||||||||| ||||| || || || || |||||| | ||||||||||||||||||||
Db       1021 TCTGCTGCTGTCATGTCCTCGGCTGACTCATCCATCCTATCAGCAAGTTCCATGTTTGCT 1080

Qy       1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
              ||||| ||||||||||||||||||||||||||||| || ||||| ||||| || |||||
Db       1081 CGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAAATTGTGTGGGTC 1140

Qy       1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
              ||| | ||||| ||||||||||||||||||||||||||||||||||||||||| |||||
Db       1141 ATGAGGATCACTGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTCACGAAG 1200

Qy       1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
              ||||||||||||||||||||||| || ||||||||||| |||||| | |||||||| |||
Db       1201 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1260

Qy       1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
              ||||| |||||||||||  | || ||||||||||| |||||||| || || |||||| ||
Db       1261 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1320

Qy       1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
              | ||| || ||||||||||| || ||||| ||||||||||| ||  | |||||||| |||
Db       1321 TTTGGACTTTTCCTGAGAATTACCGGAGGAGAGCCATATCTATACTTGCAGCCCTTAATC 1380

Qy       1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
              ||||||||||| |||||||||||  | ||||||||||| |||||||  || |||||||||
Db       1381 TTCTACCCTGGTTATTACCCTGACAAGAATGGTATATACAATCAGAGGTTCCCATTTAAA 1440



Qy 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

II II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1441 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCCTATCTAGCCAAGTAT 1500



Qy 1501 CTAT T T GAAAGT GGAAC CTT GC C AC CT AAAT T AGAT GT ATT T GAT GCT GTT GTT G CAAGA 1560 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 
Db 1501 CTATTTGAAAGTGGAACCTTGCCTCCAAAATTAGATATATTTGATGCTGTTGTCTCAAGG 1560

Qy 1561 CACAGT GAAGAAAACAT GGATAAGACAATT CTT GT CAAAAAT GAAAAT ATT AAAT T AGAT 1620 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I. I I I I I I I I II I I I II I I I I I I II 
Db 1561 CACAGT GAAGAGAACAT GGACAAGACCATT CT AGT CAGAAAT GAAAACAT CAAATT AAAT 1620 

Qy 1621 GAACTT GCACTT GT GAAGCCACGACAGAGCAT GAC CCT CAGCT CAACTTT CACCAATAAA 1680 

I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1621 GAACTTGCACCTGTAAAGCCTCGACAGAGCCTAACCCTCAGTTCAACTTTCACCAATAAA 1680

Qy 1681 G AGG C CT T C CT T GAT GT T GAT T C C AGT C C AGAAGG GT CT GGGACT GAAGAT AAT TT AC AG 174 0 

I I I I I I I I I I I I I I I I I II I I I II I I I I I I II I I I I I I I I I i I I I I I I I I I I I I 
Db 1681 GAGGCT CT CCT T GAT GT T GAT T CC AGT C CAG AGG GAT CT GGGACT GAAGAT AAC TTACAA 1740 

Qy 1741 TGA 1743 

I I I 

Db 1741 TGA 1743 
LOCUS       AB030947 4904 bp mRNA linear ROD 03-FEB-2000
DEFINITION  Rattus norvegicus mRNA for high-affinity choline transporter CHT1,
            complete cds.
ACCESSION   AB030947
VERSION     AB030947.1 GI:6863033
KEYWORDS    choline transporter; high-affinity choline transporter CHT1.
SOURCE      Rattus norvegicus (Norway rat)
  ORGANISM  Rattus norvegicus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1 (sites)
  AUTHORS   Okuda,T., Haga,T., Kanai,Y., Endou,H., Ishihara,T. and Katsura,I.
  TITLE     Identification and characterization of the high-affinity choline
            transporter
  JOURNAL   Nat. Neurosci. 3 (2), 120-125 (2000)
  MEDLINE   20116099
  PUBMED    10649566
REFERENCE   2 (bases 1 to 4904)
  AUTHORS   Okuda,T.
  TITLE     Direct Submission
  JOURNAL   Submitted (09-AUG-1999) Takashi Okuda, University of Tokyo, Faculty
            of Medicine, Department of Neurochemistry; Hongo 7-3-1, Bunkyo-ku
            113-0033, Japan (E-mail:okuda@m.u-tokyo.ac.jp, Tel:+81-3-5841-3560,
            Fax:+81-3-3814-8154)
COMMENT     Sequence updated (11-Jan-2000).
FEATURES             Location/Qualifiers
     source          1..4904
                     /organism="Rattus norvegicus"
                     /mol_type="mRNA"
                     /strain="Wistar"
                     /db_xref="taxon:10116"
                     /clone="CHT1"
                     /tissue_type="spinal cord"
                     /clone_lib="rat spinal cord cDNA library"
                     /dev_stage="adult"
     CDS             224..1966
                     /codon_start=1
                     /product="high-affinity choline transporter CHT1"
                     /protein_id="BAA90484.1"
                     /db_xref="GI:6863034"
                     /translation="MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNAEERSEAIIV
                     GGRDIGLLVGGFTMTATWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFF
                     AKPMRSKGYVTMLDPFQQIYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVD
                     VNISVIVSALIAILYTLVGGLYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFT
                     AVHAKYQSPWLGTIESVEVYTWLDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFL
                     AAFGCLVMALPAICIGAIGASTDWNQTAYGFPDPKTKEEADMILPIVLQYLCPVYISF
                     FGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVWVMRITVFVFGASA
                     TAMALLTKTVYGLWYLSSDLVYIIIFPQLLCVLFIKGTNTYGAVAGYIFGLFLRITGG
                     EPYLYLQPLIFYPGYYPDKNGIYNQRFPFKTLSMVTSFFTNICVSYLAKYLFESGTLP
                     PKLDIFDAVVSRHSEENMDKTILVRNENIKLNELAPVKPRQSLTLSSTFTNKEALLDV
                     DSSPEGSGTEDNLQ"
ORIGIN



Query Match 80.0%; Score 1394.2; DB 10; Length 4904;

Best Local Similarity 87.5%; Pred. No. 0;

Matches 1525; Conservative 0; Mismatches 218; Indels 0; Gaps 0;



Qy 1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

III I I I I I I I I II I I I I I I I I I I I I II III I I I I I I I I I I I I I II II III 

Db 224 ATGCCTTTCCATGTAGAAGGACTAGTAGCGATTATCCTGTTCTACCTTCTTATATTTCTG 283

Qy 61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 

Db 284 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGTAATGCAGAAGAACGCAGCGAA 343

Qy 121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 344 GCCATCATAGTTGGGGGCCGAGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 403

Qy 181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 404 ACCTGGGTTGGAGGAGGTTACATCAACGGGACAGCTGAAGCAGTTTATGGGCCAGGTTGT 463

Qy 241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300

II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 

Db 464 GGTCTAGCTTGGGCTCAGGCACCCATTGGATATTCTCTGAGTCTGATTTTAGGTGGCCTG 523

Qy 301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 524 TTTTTTGCAAAACCTATGCGTTCCAAGGGATATGTGACTATGTTAGACCCGTTTCAACAG 583

Qy 361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

I I I I I I I I I I I I I I I I I I I II II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 

Db 584 ATCTATGGAAAGCGCATGGGTGGGCTGCTGTTCATCCCTGCACTGATGGGAGAGATGTTC 643

Qy 421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

I I I I I I I I I I I I I I I I I I I I I II II II II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 644 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCTACCATCAGCGTAATCATTGATGTGGAT 703



Qy 4 81 AT GCACAT T T CT GT CAT CAT CT CT GCACT C ATT GCC ACT CT GT AC AC ACT GGT G GGAGGG 540 

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 7 04 GTGAACATATCGGTCATTGTCTCCGCACTCATTGCCATTCTTTATACCCTCGTGGGAGGG 7 63 

Qy 541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600 

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 7 64 CT C T AC TCT GT GGCAT AT AC T GAT GT T GTACAGCTAT T CT GCAT TT T TAT AGGAT T GT GG 823 

Qy 601 AT CAGC GT C C C CT TT GC ATT GT C ACAT C CT GCAGT CGC AGACAT C G GGTT CACT GCT GT G 660 

I II I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I II I I I I I 
Db 824 AT C AGT GT C C CAT TTGCCCTGT C ACAT C CT GCAGT C AC C GACAT T G GAT T CACT GCT GT G 883 

Qy 661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 8 84 CAT GCTAAATAC CAGAGTC CCTGGCT GGGAACCATT GAAT CAGTTGAAGT CTACACCT GG 943 

Qy 721 CT T GAT AGT TTTCTGTTGTT GAT GCT GGGT GGAAT C C CAT GGC AAGCAT ACTT T CAGAGG 780 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 944 CT T GATAAT T T T CT GT T GT T GAT GCT GGGT GGAAT AC CAT GGCAAGC CTACTT C CAGAGG 1003 

Qy 781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 84 0 

II I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 1004 GTCCTCTCTTCATCGTCAGCGACCTATGCTCAGGTGCTGTCCTTCCTGGCAGCTTTTGGG 1063 

Qy 841 T GC CT GGT GAT G GC CAT C C CAGC CAT ACT C ATT GGGGC CATT GGAGC AT CAACAGACT GG 900 

I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I II 

Db 1064 T GC CT G GT GATG GCT CT AC CAGC CAT T T G C ATT GGGG C CAT T GGAGC CT C C ACAGACT GG 1123 

Qy 901 AAC C AGACT GC AT AT GGGCT T C C AGATCC CAAGACT ACAGAAGAGGCAGACAT GATT T T A 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1124 AAC CAAACT GCAT AT GGGT T T C C AGAT C C CAAGAC CAAGGAGGAAG CAGACAT GAT T CT C 1183 

Qy 961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020 

II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 1184 CCGATTGTTCTACAGTACCTCTGCCCTGTGTACATTTCCTTCTTTGGGCTTGGTGCTGTT 1243 

Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080 

I I I I I I I I II I I I I I I II II II II I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1244 TCTGCTGCTGTCATGTCCTCGGCTGACTCATCCATCCTATCAGCAAGTTCCATGTTTGCT 1303 

Qy 1081 C G G AAC AT C T AC C AG CTTTCCTT C AG AC AAAAT G C T T C G G AC AAAG AAAT CGTTTGGGTT 1140 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I I II I I I I I 
Db 1304 CGGAATATCTACCAGCTTT CCTT CAGACAAAAT GCAT CAGACAAGGAAATTGTGTGGGT C 1363 

Qy 1141 AT GC GAAT C ACAGT GT T T GT GT T T GGAG C AT CT GCAAC AGC CAT GGCCTTGCT GAC GAAA 12 00 

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Mill 

Db 1364 AT GAGGAT CACT GT GT TT GT GT T T GGAGC AT CT GCAAC AG CC AT GGC CT T GCT CAC GAAG 1423 

Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 in 

Db 1424 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1483 

Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320 

I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II II I I I I I I II 



Db 1484 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1543



Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 154 4 T T T GGACTT T T C CT GAGAAT T AC CGGAGGAGAG C CAT AT CT AT ACT T GC AGC C CTTAAT C 1603 

Qy 1381 TTCTAC CCT GGCTATTACCCT GAT GATAAT GGTATAT AT AAT CAGAAAT TT CCATTT AAA 14 4 0 

I I I I I I I I I II I I I I I I I I I II I I I I I II I I I I I I I I I I II I I I I I I I I I I I 
Db 1604 T T C T AC C C T G GT TAT T AC CCT G AC AAGAAT G GT AT AT AC AAT C AG AG GT T C C CAT T T AAA 1663 

Qy 1441 AC AC T T G C CAT G GT T AC AT CAT T C T T AAC C AAC AT T T G CAT C T C C T AT C TAG C C AAGT AT 1500 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I 1 I I I I I I I I 
Db 1664 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCCTATCTAGCCAAGTAT 1723 

Qy 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 172 4 CTATTTGAAAGTGGAACCTTGCCTCCAAAATTAGATATATTTGATGCTGTTGTCTCAAGG 17 8 3 

Qy 1561 C AC AGT GAAGAAAAC AT G G AT AAG AC AAT T CT T GT C AAAAAT G AAAAT AT T AAAT T AGAT 1620 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II 

Db 1784 C AC AGT GAAGAGAAC AT G G AC AAGAC CAT T CT AGT CAGAAAT GAAAAC AT C AAAT T AAAT 1843 

Qy 1621 GAACTT GCACTT GTGAAGCCACGACAGAGCAT GAC CCTCAGCT CAACTTTCACCAATAAA 1680 

I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I 

Db 1844 GAACTT GCACCT GTAAAGC CT C GAC AGAG C CT AAC C CT C AGT T CAACTT T CAC CAAT AAA 19 03 

Qy 1681 GAGGCCTT C CTT GAT GTT GATT CCAGT CCAGAAGGGT CT GGGACTGAAGATAATTT ACAG 1740 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1904 GAGGCT CTCCTT GAT GTTGATT CCAGTCCAGAGGGATCT GGGACTGAAGATAACTTACAA 1963 

Qy 1741 TGA 1743 

III 

Db 1964 TGA 1966 



RESULT 10 

AF276872 

LOCUS       AF276872 1743 bp mRNA linear ROD 28-FEB-2001
DEFINITION  Mus musculus sodium and chloride-dependent high-affinity choline
            transporter mRNA, complete cds.
ACCESSION   AF276872
VERSION     AF276872.2 GI:13162669
KEYWORDS    .
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1 (bases 1 to 1743)
  AUTHORS   Apparsundaram,S., Ferguson,S.M. and Blakely,R.D.
  TITLE     Molecular cloning and characterization of human and murine
            high-affinity choline transporters
  JOURNAL   Unpublished
REFERENCE   2 (bases 1 to 1743)
  AUTHORS   Apparsundaram,S., Ferguson,S.M. and Blakely,R.D.
  TITLE     Direct Submission
  JOURNAL   Submitted (09-JUN-2000) Department of Pharmacology and Center for
            Molecular Neuroscience, Vanderbilt University, 23rd Avenue South at
            Pierce, Nashville, TN 37232-6420, USA
REFERENCE   3 (bases 1 to 1743)
  AUTHORS   Apparsundaram,S., Ferguson,S.M. and Blakely,R.D.
  TITLE     Direct Submission
  JOURNAL   Submitted (28-FEB-2001) Department of Pharmacology and Center for
            Molecular Neuroscience, Vanderbilt University, 23rd Avenue South at
            Pierce, Nashville, TN 37232-6420, USA
  REMARK    Sequence update by submitter
COMMENT     On Feb 28, 2001 this sequence version replaced gi:11527247.
FEATURES             Location/Qualifiers
     source          1..1743
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /db_xref="taxon:10090"
     CDS             1..1743
                     /codon_start=1
                     /product="sodium and chloride-dependent high-affinity
                     choline transporter"
                     /protein_id="AAG36945.2"
                     /db_xref="GI:13162670"
                     /translation="MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEERSEAIIV
                     GGRDIGLLVGGFTMTATWVGGGYINGTAEAVYGPGCGLAWAHAPIGYSLSLILGGLFF
                     AKPMRSKGYVTMLDPFKQIYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVD
                     VNISVIVSALIAILYTLVGGLYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFT
                     AVHAKYQSPWLGTIESVEVYTWLDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFL
                     AAFGCLVMALPAICIGAIGASTDWNQTAYGYPDPKTKEEADMILPIVLQYLCPVYISF
                     FGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVWVMRITVLVFGASA
                     TAMALLTKTVYGLWYLSSDLVYIIIFPQLLCVLFIKGTNTYGAVAGYIFGLFLRITGG
                     EPYLYLQPLIFYPGYYSDKNGIYNQRFPFKTLSMVTSFFTNICVSYLAKYLFESGTLP
                     PKLDVFDAVVARHSEENMDKTILVRNENIKLNELAPVKPRQSLTLSSTFTNKEALLDV
                     DSSPEGSGTEDNLQ"
ORIGIN



Query Match 78.9%; Score 1375; DB 10; Length 1743;

Best Local Similarity 86.8%; Pred. No. 0;

Matches 1513; Conservative 0; Mismatches 230; Indels 0; Gaps 0;



Qy 1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

III I I I I I I I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I I II II II III 

Db 1 ATGCCTTTCCATGTGGAAGGACTGGTAGCTATTATCCTCTTCTACCTCCTTATATTTCTG 60

Qy 61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I III 

Db 61 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGCAACCCAGAAGAGCGCAGTGAA 120

Qy 121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

I M I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 GCCATCATAGTCGGGGGCCGTGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 180

Qy 181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 181 ACCTGGGTTGGAGGAGGCTACATCAATGGGACAGCAGAAGCAGTGTATGGGCCAGGTTGT 240



Qy 

Db 



241 
241 



300 



300 



Qy 301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 

Db 301 TTTTTTGC GAAAC C T AT G C GT T C C AAG G GAT AT GT G AC TAT GT TAG AC C CAT T C AAAC AG 360 

Qy 361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420 

II I I I I II II II M M I I I I I II I I I I I M 

Db 361 ATCTATGGAAAGCGCATGGGTGGGCTGCTCTTCATCCCTGCACTGATGGGAGAGATGTTC 420 

Qy 421 T G G GCT GCAGCAATT TT CT CT GCT T T GGGAGCCACCAT C AG CGT GAT CAT C GAT GT GGAT 480 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I II I I II 
Db 421 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCCACCATCAGCGTGATCATTGATGTGGAT 480 

Qy 4 81 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540 

II I I I I II I I I I I I I I I I I I I I I I I I I I I I I III II II II I I I I I III 

Db 481 GTGAACATATCGGTCATTGTCTCTGCACTCATTGCCATTCTTTATACCCTAGTGGGTGGG 540 

Qy 541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600 

I I I I I I I I I I I I I II I I I I I I I I I I Mill II I I I I I II I I I I I I I I I I I I 

Db 541 CTCTACTCTGTGGCATATACTGATGTTGTCCAGCTATTCTGCATTTTTATAGGACTGTGG 600 

Qy 601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 601 ATCAGTGTCCCTTTTGCCCTGTCACATCCTGCAGTCACCGACATCGGATTCACAGCTGTG 660 

Qy 661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 661 CATGCTAAATACCAGAGTCCCTGGCTGGGAACCATTGAATCAGTTGAAGTCTACACCTGG 720

Qy 721 CT T GAT AGT T TT CT GT T GTT GAT GCT G GGT G GAAT CC C AT GGCAAGC ATACTT T CAGAGG 780 

I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I II II I I I I I I I I I I I I II I 

Db 721 CTTGATAATTTTCTGTTATTGATGCTGGGTGGAATCCCATGGCAAGCCTACTTCCAGAGG 780 

Qy 781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840 

II I I I I I I I I I I I I I I I I I I I I I II I I I I I II I II I I I I I I I I I I I I I I I I I III 

Db 781 GTCCTCTCTTCATCCTCAGCCACCTATGCTCAGGTACTGTCCTTCCTGGCAGCTTTTGGG 840 

Qy 841 T GCCTGGTGAT GGCCAT CCCAGCCAT ACT CATT GGGGCCATT GGAGCAT CAACAGACT GG 900 

I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I II I I I 
Db 841 TGCCTGGTGATGGCTCTACCCGCCATATGCATAGGAGCTATTGGAGCTTCCACAGACTGG 900 

Qy 901 AAC CAGACT GCAT AT GGGCT T CCAGAT CCCAAGACT ACAGAAGAGG C AGACAT GAT TTT A 960 

I I I I I I I I I I I II III I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I 

Db 901 AAC CAGACT GCCT AC GGGT AT C CAGAT CC C AAGACT AAGGAGGAAGCAGAC AT GAT T CT C 960 

Qy 961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020 

II II I I I I I I I I I I I I I I I II I I I I I I I I II II I I I I I I I I I I I I I I I I III 

Db 961 CCGATCGTTCTGCAGTACCTCTGCCCTGTGTACATCTCCTTCTTTGGGCTTGGTGCTGTT 1020 

Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 108 0 

II I I II I I I I I I I I I I I I I I II II I I I II I I I I I II I I I I I I I I I I I I I 

Db 1021 TCAGCTGCTGTCATGTCCTCAGCTGACTCGTCCATCCTGTCGGCGAGTTCTATGTTTGCT 1080 

Qy 1081 C G GAAC AT CT AC C AGCT T T C CT T C AGACAAAAT GCTT C G GAC AAAGAAAT C GT T T G GGT T 114 0 

I I I I I I I I II I II I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I II I I I I I 
Db 1081 C GGAAT AT CTAC CAGCTTT C CT T C AGACAAAAT GCAT CAGACAAGGAAAT T GT GT GGGT C 1140 



Qy 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

Ml I I I I I I III I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1141 ATGAGGATCACTGTGCTTGTGTTCGGAGCATCTGCAACAGCCATGGCTTTGCTGACGAAG 1200



Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 12 60 

I I I I I II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I M 

Db 1201 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1260 

Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1261 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1320 

Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1321 TTT GGAC TAT T C CT GAGAAT TACT GGAGGAGAG C CAT AT CT AT ACT T GCAGCC CT T AAT C 1380 

Qy 1381 TT CT ACCCT GGCT ATT AC C CT GAT GAT AAT GGT AT AT AT AAT CAGAAATTT CCATT TAAA 14 4 0 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1381 TT CTACCCT GGTT ATT ACT CT GACAAGAAT GGT ATATACAATCAGAGGTT CCCATTTAAA 1440 

Qy 1441 AC ACT T G C CAT G GT T AC AT CAT T CTT AAC C AAC AT TT G CAT CT C CT AT CT AG C C AAGT AT 1500 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1441 ACT CT C T C CAT GGT T AC C T CAT T C T T T AC CAAC AT TTGTGTTTCT TAT C T AGC C AAGT AT 1500 

Qy 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560 

I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I II I I I I ! I I I I I I I I I I I I 
Db 1501 CT AT T T GAAAGT GGAACCT T GCCT CCAAAATT AGAT GT AT T T GAT GCT GTT GT C GCAAGG 1560 

Qy 1561 CACAGT GAAGAAAAC AT GGAT AAGACAAT T CTT GT CAAAAAT GAAAAT AT T AAAT T AGAT 1620 

I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 1561 CACAGT GAAG AGAAC AT G G AC AAGAC CAT T CT AGT C AGAAAT GAAAAT AT C AAAT T AAAT 162 0 

Qy 1621 G AAC T T G C AC T T GT GAAG C C AC G AC AGAG CAT GAC C C T C AG C T CAAC TTT C AC C AAT AAA 1680 

I I I I I I I I I I I I I I I I II M I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I 
Db 1621 GAACTTGCACCTGTGAAACCTCGGCAGAGCCTAACCCTCAGTTCAACTTTCACCAATAAG 168 0 

Qy 1681 GAGGC CT T C CT T GAT GT T GAT T C C AGT CC AGAAG G GT CT GG GACT G AAGAT AAT T T ACAG 174 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1681 GAGGC CCTC CTT GAT GTTGATTCCAGT CCGGAGGGGT CT GGGACTGAAGATAATTTACAA 17 4 0 

Qy 1741 TGA 1743 

I I I 

Db 1741 TGA 1743 



RESULT 11 

E49872 

LOCUS       E49872 1743 bp DNA linear PAT 27-AUG-2002
DEFINITION  High-affinity choline transporter.
ACCESSION   E49872
VERSION     E49872.1 GI:22554903
KEYWORDS    JP 2001136976-A/4.
SOURCE      Mus sp.
  ORGANISM  Mus sp.
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1 (bases 1 to 1743)
  AUTHORS   Haga,T. and Okuda,T.
  TITLE     High-affinity choline transporter
  JOURNAL   Patent: JP 2001136976-A 4 22-MAY-2001;
            SCIENCE & TECH AGENCY
COMMENT     OS   Mus sp. (mouse)
            PN   JP 2001136976-A/4
            PD   22-MAY-2001
            PF   27-DEC-1999 JP 1999368991
            PI   TATSUYA HAGA, TAKASHI OKUDA
            PC   C12N15/09,A01K67/027,A61K38/00,C07K14/47,C07K16/18,C07K19/00,
            PC   C12N5/10,
            PC   C12P21/02,C12P21/08,C12Q1/00,C12N15/00,A61K37/02,C12N5/00
            CC
            FH   Key             Location/Qualifiers
            FT   CDS             (1)..(1743).
FEATURES             Location/Qualifiers
     source          1..1743
                     /organism="Mus sp."
                     /mol_type="genomic DNA"
                     /db_xref="taxon:10095"
ORIGIN



Query Match 78.8%; Score 1373.4; DB 6; Length 1743;

Best Local Similarity 86.7%; Pred. No. 0;

Matches 1512; Conservative 0; Mismatches 231; Indels 0; Gaps 0;



Qy 1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

Ml I I I I I I I II I I I I I I I I I I I I I I I I III I I I I I I I I I II II II Ml 

Db 1 ATGTCTTTCCACGTAGAAGGACTGGTAGCTATTATCCTCTTCTACCTCCTTATATTTCTG 60

Qy 61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I III III 

Db 61 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGCAACCCAGAAGAGCACAGTGAA 120

Qy 121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

I I II I I I I I I I II I I I I I II I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 GCCATCATAGTCGGGGGCCGTGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 180

Qy 181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

I I I I I I I I I I I I I I I I II I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I 

Db 181 ACCTGGGTTGGAGGAGGCTACATCAATGGGACAGCAGAAGCAGTGTATGGGCCAGGTTGT 240

Qy 241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300

II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II II II I I I III 

Db 241 GGTCTAGCTTGGGCTCAGGCACCCATTGGATATTCTCTGAGTCTAATTTTAGGTGGTCTG 300

Qy 301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

II I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I II 

Db 301 TTTTTTGCGAAACCTATGCGTTCCAAGGGATATGTGACTATGTTAGACCCATTTCAACAG 360

Qy 361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

I I I I I I I I I I I I I I I II I I II II II II II I I I I I I I I II I I I II I I MINI 

Db 361 ATCTATGGAAAGCGCATGGGTGGGCTGCTCTTCATCCCTGCACTGATGGGAGAGATGTTC 420

Qy 421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I II I I I I I I I 

Db 421 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCCACCATCAGCGTGATCATTGATGTGGAT 480

Qy 481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

II I I II II I I I I I I I I I I I I I I I I I I I I I I I I II II II M I I I i I Ml 

Db 481 GTGAACATATCGGTCATTGTCTCTGCACTCATTGCCATTCTTTATACCCTAGTGGGTGGG 540



Qy 541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600 

I I I I I I I I I I I I I II II I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I 

Db 541 CTCTACTCTGTGGCATATACTGATGTTGTCCAGCTATTCTGCATTTTTATAGGACTGTGG 600 

Qy 601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 601 ATCAGTGTCCCTTTTGCCCTGTCACATCCTGCAGTCACCGACATCGGATTCACAGCTGTG 660 

Qy 661 CAT GC CAAAT ACCAAAAGC CGT GGCT GGGAACT GTTGACT CAT CT GAAGT CT ACT CTT GG 72 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 661 CAT GC T AAAT AC CAGAGT C C CT GGCT GGGAAC CATT GAAT CAGT T GAAGT CT ACAC CT GG 720 

Qy 721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 78 0 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 721 CTTGATAATTTTCTGTTATTGATGCTGGGTGGAATCCCATGGCAAGCCTACTTCCAGAGG 78 0 

Qy 781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840 

II I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I II I I I II I III 

Db 781 GTCCTCTCTTCATCCTCAGCCACCTATGCTCAGGTACTGTCCTTCCTGGCAGCTTTTGGG 84 0 

Qy 841 T GCCT GGT GAT GGC CAT CC CAGCCAT ACT CATT GGGGCCATTGGAGCAT CAACAGACT GG 900 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 841 T GCCT GGT GAT GGCT CTACCCGCCAT AT GCATAGGAGCT ATT GGAGCTT CCACAGACT GG 900 

Qy 901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I II II I I I I II I I I I I I I 

Db 901 AACCAGACT GC CT AC GGGT AT C CAGAT C C CAAGAC T AAG GAGGAAG CAGAC AT GAT T CT C 960 

Qy 961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I III 

Db 961 CCGATCGTTCTGCAGTACCTCTGCCCTGTGTACATCTCCTTCTTTGGGCTTGGTGCTGTT 1020 

Qy 1021 T CT GCT GCT GTT AT GT CAT CAGCAGATT CT TCCATCTT GT CAGCAAGTT CCAT GTTT GCA 108 0 

II I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I II I I I II I I I I I I I I 

Db 1021 TCAGCTGCTGTCATGTCCTCAGCTGACTCGTCCATCCTGTCGGCGAGTTCTATGTTTGCT 1080 

Qy 1081 C GGAACAT CT AC CAGCTTT CCT T CAGACAAAAT GCT T C GGACAAAGAAAT CGT TT G GGTT 1140 

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I 
Db 10 81 C GGAAT AT CTAC CAGCTT T C CT T CAGACAAAAT GCAT C AGACAAG GAAATT GT GT GGGT C 1140 

Qy 1141 AT GC GAAT CACAGT GT TT GT GT TT GGAGC AT CT GCAAC AGCC AT GG CCT T GCT GAC GAAA 1200 

III I Mill III I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1141 AT GAGGAT CACT GTGCTTGTGTT C GGAGC AT CT GCAAC AGCC AT GGCTT T GCT GAC GAAG 1200 

Qy 12 01 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260 

I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I II I I I II I I I I I I I I I I I III 

Db 12 01 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 12 60 

Qy 12 61 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II II I I I I I I II 
Db 1261 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1320 

Qy 1321 TCTGGCCTCTTCCT GAGAAT AACT GGAGGG GAGC C AT AT CT GT AT CTT C AG C C CT T GAT C 1380 

I I I I II I I I I I I I I I II I I I I II I I I I I I I I I II I I I I I I I I I I I I I I I I 



Db 1321 TTTGGACTATTCCTGAGAATTACTGGAGGAGAGCCATATCTATACTTGCAGCCCTTAATC 1380



Qy 1381 T T C T AC C CT G G CT AT TAC CCT GAT GAT AAT GGT AT ATATAAT CAGAAAT TT C CATTTAAA 144 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1381 TTCTACCCTGGTTATTACTCTGACAAGAATGGTATATACAATCAGAGGTTCCCATTTAAA 1440 

Qy 14 41 ACACT T GC C AT GGT TAC AT CAT T CT T AAC C AAC AT T T G CAT CT C CT AT CTAGC CAAGT AT 1500 

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I 
Db 1441 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCTTATCTAGCCAAGTAT 1500 

Qy 1501 CT AT T T G AAAGT G GAAC CT T GC C AC CT AAAT T AGAT GT AT T T GAT GCT GT T GT T GC AAGA 1560 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I Mill 
Db 1501 CT AT TT GAAAGT G GAAC CTT G CCT C CAAAAT TAGAT GT AT T T GAT GCT GTT GT C GCAAGG 1560 

Qy 1561 CACAGT GAAGAAAACAT GGAT AAGACAATT CTT GT CAAAAAT GAAAAT ATTAAATT AGAT 162 0 

I I I I I I I I I I I I I I I I I I I Mill I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 1561 CACAGT GAAGAGAACAT GGACAAGACCATT CT AGT CAGAAAT GAAAAT AT CAAATT AAAT 162 0 

Qy 1621 GAACTT GCACTT GT GAAGCCACGACAGAGCAT GAC CCT CAGCT CAACTTT C ACCAATAAA 168 0 

I I II I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1621 GAACTTGCACCTGTGAAACCTCGGCAGAGCCTAACCCTCAGTTCAACTTTCACCAATAAG 1680 

Qy 1681 GAGGC CT T C CT T GAT GT T GATT C C AGT C CAGAAG G GT CT GG GACT GAAGAT AAT T T ACAG 174 0 

I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1681 GAGGCCCTCCTTGATGTTGATTCCAGTCCGGAGGGGTCTGGGACTGAAGATAACTTACAA 174 0 

Qy 1741 TGA 1743 

I I I 

Db 1741 TGA 1743 



RESULT 12 

BD012720 

LOCUS       BD012720 1743 bp DNA linear PAT 02-AUG-2002
DEFINITION  High-affinity choline transporter.
ACCESSION   BD012720
VERSION     BD012720.1 GI:22092909
KEYWORDS    WO 0116315-A/4.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1 (bases 1 to 1743)
  AUTHORS   Haga,T. and Okuda,T.
  TITLE     High-affinity choline transporter
  JOURNAL   Patent: WO 0116315-A 4 08-MAR-2001;
            JAPAN SCIENCE AND TECHNOLOGY CORP,TATSUYA HAGA, TAKASHI OKUDA
COMMENT     OS   Mus musculus (mouse)
            PN   WO 0116315-A/4
            PD   08-MAR-2001
            PF   18-AUG-2000 WO 2000JP005545
            PR   27-AUG-1999 JP 99P 240642, 27-DEC-1999 JP 99P 368991
            PI   TATSUYA HAGA, TAKASHI OKUDA
            PC   C12N15/12,C07K14/47,C12Q1/68,C07K19/00,C07K16/18,C12N5/10,
            PC   A61K38/17,
            PC   A61K45/00,A61P25/28,G01N33/53,A01K67/027
            CC
            FH   Key             Location/Qualifiers
            FT   CDS             (1)..(1743).
FEATURES             Location/Qualifiers
     source          1..1743
                     /organism="Mus musculus"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:10090"
ORIGIN

Query Match 78.8%; Score 1373.4; DB 6; Length 1743; 

Best Local Similarity 86.7%; Pred. No. 0; 

Matches 1512; Conservative 0; Mismatches 231; Indels 0; Gaps 0; 

Qy 1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60 

III I I I I I I I II I I I I I I I I I I I I I I I I III I I I I I I I I I II II II III 

Db 1 ATGTCTTTCCACGTAGAAGGACTGGTAGCTATTATCCTCTTCTACCTCCTTATATTTCTG 60 

Qy 61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120 

I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 II I I I I I I I I I I I I I I I Ml 

Db 61 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGCAACCCAGAAGAGCACAGTGAA 120

Qy 121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

I I I I I I I II I I II I I I I I II I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I 

Db 121 GCCATCATAGTCGGGGGCCGTGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 180 

Qy 181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

I M I I I I I II I I I I I I II I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I 

Db 181 ACCTGGGTTGGAGGAGGCTACATCAATGGGACAGCAGAAGCAGTGTATGGGCCAGGTTGT 240

Qy 241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300 

II I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I I Mill I I I I I I I I I I I Ml 

Db 241 GGTCTAGCTTGGGCTCAGGCACCCATTGGATATTCTCTGAGTCTAATTTTAGGTGGTCTG 300 

Qy 301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360 

II II I II I I I I II II I I I II I Mill II I I I I II I I II I I II I I I Mill II 
Db 301 TTTTTTGCGAAACCTATGCGTTCCAAGGGATATGTGACTATGTTAGACCCATTTCAACAG 360

Qy 361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420 

I I I I II II I I I II I I II I I II II II M II I M II I I I II M II II I I I II I I 

Db 361 ATCTATGGAAAGCGCATGGGTGGGCTGCTCTTCATCCCTGCACTGATGGGAGAGATGTTC 420

Qy 421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

M II II II II I I II I I II I II M II II II I I I II I I II I II M I II I I I I I II I I I 
Db 421 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCCACCATCAGCGTGATCATTGATGTGGAT 480 

Qy 481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

II MM II II I I I I I II I II I I II II I II I I III M II II Mill III 
Db 481 GTGAACATATCGGTCATTGTCTCTGCACTCATTGCCATTCTTTATACCCTAGTGGGTGGG 540

Qy 541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600 

I I I II I I II I II I II II I II I II II I II I I M M II II II I I II I II I I I I 

Db 541 CTCTACTCTGTGGCATATACTGATGTTGTCCAGCTATTCTGCATTTTTATAGGACTGTGG 600 



Qy 601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660

Db 601 660



Qy 661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 
Db 661 CATGCTAAATACCAGAGTCCCTGGCTGGGAACCATTGAATCAGTTGAAGTCTACACCTGG 720

Qy 721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M II I I I I II 

Db 721 CTTGATAATTTTCTGTTATTGATGCTGGGTGGAATCCCATGGCAAGCCTACTTCCAGAGG 780

Qy 781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I II I I II I I I I I I III 

Db 781 GTCCTCTCTTCATCCTCAGCCACCTATGCTCAGGTACTGTCCTTCCTGGCAGCTTTTGGG 840

Qy 841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 841 TGCCTGGTGATGGCTCTACCCGCCATATGCATAGGAGCTATTGGAGCTTCCACAGACTGG 900

Qy 901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I II II I I I I I II I I I I I I 

Db 901 AACCAGACTGCCTACGGGTATCCAGATCCCAAGACTAAGGAGGAAGCAGACATGATTCTC 960

Qy 961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020 

II II I I I I I I I I I I I I I I I I I I i I I II I I II II I I I I I I I I I I I I I I I I III 

Db 961 CCGATCGTTCTGCAGTACCTCTGCCCTGTGTACATCTCCTTCTTTGGGCTTGGTGCTGTT 1020 

Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080 

II II I I II I I I I I I I I I I I I II II I I I I I I I I I I II I I I I I I I I I I I I I 

Db 1021 TCAGCTGCTGTCATGTCCTCAGCTGACTCGTCCATCCTGTCGGCGAGTTCTATGTTTGCT 1080 

Qy 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I II II I I I I I II I I I II 
Db 1081 CGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAAATTGTGTGGGTC 1140

Qy 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200 

III I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1141 ATGAGGATCACTGTGCTTGTGTTCGGAGCATCTGCAACAGCCATGGCTTTGCTGACGAAG 1200

Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I III 
Db 1201 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1260

Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320 

I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1261 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1320

Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380 

I III II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 1321 TTTGGACTATTCCTGAGAATTACTGGAGGAGAGCCATATCTATACTTGCAGCCCTTAATC 1380

Qy 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I II I I I I I I I I I I I 

Db 1381 TTCTACCCTGGTTATTACTCTGACAAGAATGGTATATACAATCAGAGGTTCCCATTTAAA 1440

Qy 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

II II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 

Db 1441 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCTTATCTAGCCAAGTAT 1500 

Qy 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560



1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 1501 CTATTTGAAAGTGGAACCTTGCCTCCAAAATTAGATGTATTTGATGCTGTTGTCGCAAGG 1560 

Qy 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I II I I I I I I I I II 

Db 1561 CACAGTGAAGAGAACATGGACAAGACCATTCTAGTCAGAAATGAAAATATCAAATTAAAT 1620

Qy 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

I I I I Mill II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1621 GAACTTGCACCTGTGAAACCTCGGCAGAGCCTAACCCTCAGTTCAACTTTCACCAATAAG 1680

Qy 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I 
Db 1681 GAGGCCCTCCTTGATGTTGATTCCAGTCCGGAGGGGTCTGGGACTGAAGATAACTTACAA 1740

Qy 1741 TGA 1743 

I I I 

Db 1741 TGA 1743 



RESULT 13 

AX080443 

LOCUS AX080443 4938 bp DNA linear PAT 22-FEB-2001
DEFINITION Sequence 1 from Patent WO0078950.
ACCESSION AX080443
VERSION AX080443.1 GI:13159872
KEYWORDS
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1
AUTHORS Sierzega,M. and Albrandt,K.
TITLE Differentially expressed genes in the adipocytes of obese mice
JOURNAL Patent: WO 0078950-A 1 28-DEC-2000;
AMYLIN PHARMACEUTICALS, INC. (US)
FEATURES Location/Qualifiers
source 1..4938
/organism="Mus musculus"
/mol_type="unassigned DNA"
/db_xref="taxon:10090"
/note="P4P6Bl"



ORIGIN 



Query Match 78.8%; Score 1373.4; DB 6; Length 4938;

Best Local Similarity 86.7%; Pred. No. 0;

Matches 1512; Conservative 0; Mismatches 231; Indels 0; Gaps 0;



Qy 1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

III I I I I I I I II I I I I I I I II I I I I I I I III I I I I I I I I I II II I III

Db 247 ATGTCTTTCCACGTAGAAGGACTGGTAGCTATTATCCTCTTCTACCTCCTTATACTTCTG 306

Qy 61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III

Db 307 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGCAACCCAGAAGAGCGCAGTGAA 366

Qy 121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

I I II I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 367 GCCATCATAGTCGGGGGCCGTGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 426



Qy 181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 427 ACCTGGGTTGGAGGAGGCTACATCAATGGGACAGCAGAAGCAGTGTATGGGCCAGGTTGT 486

Qy 241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300 

II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 487 GGTCTAGCTTGGGCTCAGGCACCCATTGGATATTCTCTGAGTCTAATTTTAGGTGGTCTG 546 

Qy 301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II 
Db 547 TTTTTTGCGAAACCTATGCGTTCCAAGGGATATGTGACTATGTTAGACCCATTTCAACAG 606

Qy 361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

I I I I I II I I I I I I I I I I I I II II II II II I I I II I I I I I I I II I I I I I I I I I 

Db 607 ATCTATGGAAAGCGCATGGGTGGGCTGCTCTTCATCCCTGCACTGATGGGAGAGATGTTC 666 

Qy 421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480 

I I I I I I I I I I I I I I I I I I I I I I I II II I II I I I II I I I I I I I I I I I I I I I I I I I I I 
Db 667 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCCACCATCAGCGTGATCATTGATGTGGAT 726

Qy 481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 

Db 727 GTGAACATATCGGTCATTGTCTCTGCACTCATTGCCATTCTTTATACCCTAGTGGGTGGG 786 

Qy 541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600 

I I I I I I I I I I I I I II I I I I I I I I II Mill II MINIMI I I II I I I I I I 
Db 787 CTCTACTCTGTGGCATATACTGATGTTGTCCAGCTATTCTGCATTTTTATAGGACTGTGG 846

Qy 601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660

M II I I II I I I I I I I I I I II I I I I I I I I I I I I I I II II II I I I II I II I I I I 
Db 847 ATCAGTGTCCCTTTTGCCCTGTCACATCCTGCAGTCACCGACATCGGATTCACAGCTGTG 906 

Qy 661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

I I I I I I II I I I M I I I I II I I I I I I I I II I I I II II II I II II I I II I 

Db 907 CATGCTAAATACCAGAGTCCCTGGCTGGGAACCATTGAATCAGTTGAAGTCTACACCTGG 966

Qy 721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780 

I I I II I I I I I II I I I I I I II I II I I II I M I I II II II II I I I M I II II I II I I I 

Db 967 CTTGATAATTTTCTGTTATTGATGCTGGGTGGAATCCCATGGCAAGCCTACTTCCAGAGG 1026

Qy 781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840 

II I I I I I II I I I I I I II II I I I I II I I I I I II II I II I II II II I I I I II I I M I 

Db 1027 GTCCTCTCTTCATCCTCAGCCACCTATGCTCAGGTACTGTCCTTCCTGGCAGCTTTTGGG 1086 

Qy 841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900

I M II I I I II II I I I I I II II I I I II I I I I II M I I II I I I I I I I I I II 

Db 1087 TGCCTGGTGATGGCTCTACCCGCCATATGCATAGGAGCTATTGGAGCTTCCACAGACTGG 1146

Qy 901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960 

I II I I I I I I I I II III M II II I I I I I I I I I I I II II I II I I I I I I I I I I 

Db 1147 AACCAGACTGCCTACGGGTATCCAGATCCCAAGACTAAGGAGGAAGCAGACATGATTCTC 1206

Qy 961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020 

II M I I I II I II I I I I I I I II I II I M II II II II II I I II I II II I I I III 



Db 1207 CCGATCGTTCTGCAGTACCTCTGCCCTGTGTACATCTCCTTCTTTGGGCTTGGTGCTGTT 1266 

Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080 

II I I I I I I I I I I I I I I I I I I M M I I I I I I I I I I M I I M I I I I I I I I I 

Db 1267 TCAGCTGCTGTCATGTCCTCAGCTGACTCGTCCATCCTGTCGGCGAGTTCTATGTTTGCT 1326 

Qy 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I II I I I II I I I I I 
Db 1327 CGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAAATTGTGTGGGTC 1386

Qy 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

III I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 

Db 1387 ATGAGGATCACTGTGCTTGTGTTCGGAGCATCTGCAACAGCCATGGCTTTGCTGACGAAG 1446

Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260 

I I I I I I I I I I I I I I I I II I I I I I II I II I I I I I I I I I I I I II I I I I I I I I I III 
Db 1447 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1506

Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320

I I I I I I I I I II I I I I I I II I I I II I I I I II I I I I I I I I II II I I I I I I II 

Db 1507 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1566 

Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380 

I III II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I III 

Db 1567 TTTGGACTATTCCTGAGAATTACTGGAGGAGAGCCATATCTATACTTGCAGCCCTTAATC 1626

Qy 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 
Db 1627 TTCTACCCTGGTTATTACTCTGACAAGAATGGTATATACAATCAGAGGTTCCCATTTAAA 1686

Qy 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I 
Db 1687 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCTTATCTAGCCAAGTAT 1746

Qy 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 1747 CTATTTGAAAGTGGAACCTTGCCTCCAAAATTAGATGTATTTGATGCTGTTGTCGCAAGG 1806 

Qy 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I II 
Db 1807 CACAGTGAAGAGAACATGGACAAGACCATTCTAGTCAGAAATGAAAATATCAAATTAAAT 1866

Qy 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

I I I I I II I I I I I I I I I II II I I I I II I I I I I I I II I I I I I II I I I I I I I I I I 
Db 1867 GAACTTGCACCTGTGAAACCTCGGCAGAGCCTAACCCTCAGTTCAACTTTCACCAATAAG 1926

Qy 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1927 GAGGCCCTCCTTGATGTTGATTCCAGTCCGGAGGGGTCTGGGACTGAAGATAACTTACAA 1986

Qy 1741 TGA 1743 

I I I 

Db 1987 TGA 1989 



RESULT 14 
MMU401467 



LOCUS MMU401467 1743 bp mRNA linear ROD 16-AUG-2000
DEFINITION Mus musculus mRNA for high affinity choline transporter (CHT1
gene).
ACCESSION AJ401467
VERSION AJ401467.1 GI:9843808
KEYWORDS CHT1 gene; high affinity choline transporter.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1
AUTHORS Wieland,A., Bonisch,H. and Bruss,M.
TITLE Molecular cloning of the human and murine high affinity choline
transporters and characterization of the human gene-structure
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 1743)
AUTHORS Bruess,M.
TITLE Direct Submission
JOURNAL Submitted (14-AUG-2000) Bruess M., University of Bonn, Pharmacology
and Toxicology, Reuterstr. 2b, D-53113 Bonn, GERMANY
FEATURES Location/Qualifiers
source 1..1743
/organism="Mus musculus"
/mol_type="mRNA"
/strain="BALB/cJ"
/db_xref="taxon:10090"
/tissue_type="brainstem"
gene 1..1743
/gene="CHT1"
CDS 1..1743
/gene="CHT1"
/function="sodium- and chloride-dependent reuptake of
choline"
/codon_start=1
/evidence=experimental
/product="high affinity choline transporter"
/protein_id="CAC03719.1"
/db_xref="GI:9843809"
/db_xref="GOA:Q9ESW5"
/db_xref="SPTREMBL:Q9ESW5"
/translation="MSFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEEHSEAIIV
GGRDIGLLVGGFTMTATWVGGGYINGTAVAVYGPGCGLAWAQAPIGYSLSLILGGLFF
AKPMRSKGYVTMLDPFQQIYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVD
VNISVIVSALIAILYTLVGGLYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFT
AVHAKYQSPWLGTIESVEVYTWLDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSYL
AAFGCLVMALPAICIGAIGASTDWNQTAYGYPDPKTKEEADMILPIVLQYLCPVYISF
FGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVWVMRITVLVFGASA
TAMALLTKTVYGLWYLSSDLVYIIIFPQLLCVLFIKGTNTYGAVAGYIFGLFLRITGG
EPYLYLQPLIFYPGYYSDKNGIYNQRFPFKTLSMVTSFFTNICVSYLAKYLFESGTLP
PKLDVFDAVVARHSEENMDKTILVRNENIKLNELAPVKPRQSLTLSSTFTNKEALLDV
DSSPEGSGTEDNLQ"

ORIGIN 

Query Match 78.4%; Score 1367; DB 10; Length 1743;

Best Local Similarity 86.5%; Pred. No. 0;

Matches 1508; Conservative 0; Mismatches 235; Indels 0; Gaps 0;



Qy 1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

II I I I I I I I I II I I I I I I I I I I I I I I I I III I I I I I I I I I II II II Ml

Db 1 ATGTCTTTCCACGTAGAAGGACTGGTAGCTATTATCCTCTTCTACCTCCTTATATTTCTG 60

Qy 61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I III II I

Db 61 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGCAACCCAGAAGAGCACAGTGAA 120

Qy 121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I

Db 121 GCCATCATAGTCGGGGGCCGTGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 180

Qy 181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

MINIM I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 181 ACCTGGGTTGGAGGAGGCTACATCAATGGGACAGCAGTAGCAGTGTATGGGCCAGGTTGT 240

Qy 241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300

II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II III

Db 241 GGTCTAGCTTGGGCTCAGGCACCCATTGGATATTCTCTGAGTCTAATTTTAGGTGGTCTG 300

Qy 301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II

Db 301 TTTTTTGCGAAACCTATGCGTTCCAAGGGATATGTGACTATGTTAGACCCATTTCAACAG 360

Qy 361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

I I I I I I I I I I I II I I I I I I II II II II II I i I I I I I I I I I I I I I I I I I I I I I

Db 361 ATCTATGGAAAGCGCATGGGTGGGCTGCTCTTCATCCCTGCACTGATGGGAGAGATGTTC 420

Qy 421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 421 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCCACCATCAGCGTGATCATTGATGTGGAT 480

Qy 481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

II I I I I II I I I I I I I I I I I I I I I I I I I I I I III II II II Mill III

Db 481 GTGAACATATCGGTCATTGTCTCCGCACTCATTGCCATTCTTTATACCCTAGTGGGTGGG 540

Qy 541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600

I I I I I I I I I I I I I II I I I I II I I II I I I I I II I I I I I I I I I I I I I I I I I I I

Db 541 CTCTACTCTGTGGCATATACTGATGTTGTCCAGCTATTCTGCATTTTTATAGGACTGTGG 600

Qy 601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660

I I I I I I I I I I Mill I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I

Db 601 ATCAGTGTCCCTTTTGCCCTGTCACATCCTGCAGTCACCGACATCGGATTCACAGCTGTG 660

Qy 661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

I I I I I I I I I I I I I I II I II I I I II I I I I I I I III I I I I I I I I I I I III

Db 661 CATGCTAAATACCAGAGTCCCTGGCTGGGAACCATTGAATCAGTTGAAGTCTACACCTGG 720

Qy 721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780

I I I I I I I MINIMI I I I I I M I I I II I I I I I I I I I II I I I I I I I I I I I MIMI

Db 721 CTTGATAATTTTCTGTTATTGATGCTGGGTGGAATCCCATGGCAAGCCTACTTCCAGAGG 780

Qy 781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

II II I I I I I I I I I I I M I I I I I I I I I M I I II I I I I II I I I I I I I I I I I I I III

Db 781 GTCCTCTCTTCATCCTCAGCCACCTATGCTCAGGTACTGTCCTACCTGGCAGCTTTTGGG 840

Qy 841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900

I I I I I I I I I I I I I I I II I I I I I I III II I I I I I I I I I I M I I I I I I I I I

Db 841 TGCCTGGTGATGGCTCTACCCGCCATATGCATAGGAGCTATTGGAGCTTCCACAGACTGG 900



Qy 901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 901 AACCAGACTGCCTACGGGTATCCAGATCCCAAGACTAAGGAGGAAGCAGACATGATTCTC 960

Qy 961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020 

II II I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I III 

Db 961 CCGATCGTTCTGCAGTACCTCTGCCCTGTGTACATCTCCTTCTTTGGGCTTGGTGCTGTT 1020 

Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080 

II I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I II I I I I I I I I M I II 

Db 1021 TCAGCTGCTGTCATGTCCTCAGCTGACTCGTCCATCCTGTCGGCGAGTTCTATGTTTGCT 1080 

Qy 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I II II II I I I I I 
Db 1081 CGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAAATTGTGTGGGTC 1140

Qy 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

III I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1141 ATGAGGATCACTGTGCTTGTGTTCGGAGCATCTGCAACAGCCATGGCTTTGCTGACGAAG 1200

Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I M 

Db 1201 ACTGTGTATGGGCTGTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1260

Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320 

II I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II II I I I I I I II 

Db 1261 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1320 

Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380

I I II II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I II I I I I I III 
Db 1321 TTTGGACTATTCCTGAGAATTACTGGAGGAGAGCCATATCTATACTTGCAGCCCTTAATC 1380

Qy 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II 
Db 1381 TTCTACCCTGGTTATTACTCTGACAAGAATGGTATATACAATCAGAGGTTCCCATTTAAA 1440

Qy 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1441 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCTTATCTAGCCAAGTAT 1500

Qy 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1501 CTATTTGAAAGTGGAACCTTGCCTCCAAAATTAGATGTATTTGATGCTGTTGTCGCAAGG 1560

Qy 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 1561 CACAGTGAAGAGAACATGGACAAGACCATTCTAGTCAGAAATGAAAATATCAAATTAAAT 1620

Qy 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

I I I I I I I I I I MINI II II I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I 
Db 1621 GAACTTGCACCTGTGAAACCTCGGCAGAGCCTAACCCTCAGTTCAACTTTCACCAATAAG 1680

Qy 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 



Db 1681 GAGGCCCTCCTTGATGTTGATTCCAGTCCGGAGGGGTCTGGGACTGAAGATAACTTACAA 1740



Qy 1741 TGA 1743 

III 

Db 1741 TGA 1743 
RESULT 15
TMA420808
LOCUS TMA420808 2528 bp mRNA linear VRT 27-NOV-2001
DEFINITION Torpedo marmorata mRNA for high affinity choline transporter (CHT1
gene).
ACCESSION AJ420808
VERSION AJ420808.1 GI:17148508
KEYWORDS CHT1 gene; high affinity choline transporter.
SOURCE Torpedo marmorata (marbled electric ray)
ORGANISM Torpedo marmorata
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Chondrichthyes;
Elasmobranchii; Squalea; Hypnosqualea; Pristiorajea; Batoidea;
Torpediniformes; Torpedinoidei; Torpedinidae; Torpedo.
REFERENCE 1
AUTHORS Guermonprez,L., O'Regan,S., Meunier,F.M. and
Morot-Gaudry-Talarmain,Y.
TITLE Cyclosporin, FK506 and rapamycin inhibit neuronal choline uptake
via calcineurin-dependent and independent mechanisms
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 2528)
AUTHORS O'Regan,S.
TITLE Direct Submission
JOURNAL Submitted (21-NOV-2001) O'Regan S., Neurobiologie Cellulaire et
Moleculaire, C.N.R.S., 1 av de la Terrasse, F-91198 Gif-sur-Yvette,
FRANCE
FEATURES Location/Qualifiers
source 1..2528
/organism="Torpedo marmorata"
/mol_type="mRNA"
/db_xref="taxon:7788"
/clone="tH312"
/tissue_type="electric lobe"
/tissue_lib="lambda ZAPII ELL"
gene 1..2528
/gene="CHT1"
CDS 49..1803
/gene="CHT1"
/function="neuronal Na-dependent choline transporter"
/codon_start=1
/evidence=experimental
/product="high affinity choline transporter"
/protein_id="CAD12727.1"
/db_xref="GI:17148509"
/db_xref="GOA:Q8UWF0"
/db_xref="SPTREMBL:Q8UWF0"
/translation="MTVHIDGIVAIVLFYLLILFVGLWAAWKSKNTSMEGAMDRSEAI
MIGGRDIGLLVGGFTMTATWVGGGYINGTAEAVYVPGYGLAWAQAPFGYALSLVIGGL
FFAKPMRSRGYVTMLDPFQQMYGKRMGGLLFIPALLGEIFWSAAILSALGATLSVIVD
ININVSVVVSAVIAVLYTLVGGLYSVAYTDVVQLFCIFLGLWISIPFALLNPAVTDII
VTANQEVYQEPWVGNIQSKDSLIWIDNFLLLMLGGIPWQVYFQRVLSASSATYAQVLS
FLAAFGCVLMAIPSVLIGAIGTSTDWNQTSYGLPGPIGKNETDMILPIVLQHLCPPYI
SFFGLGAVSAAVMSSADSSILSASSMFARNIYHLAFRQEASDKEIVWVMRITIFLFGG
AATSMALLAQSIYGLWYLSSDLVYVIIFPQLISVLFVKGTNTYGSIAGYIIGFLLRIS
GGEPYLHMQPFIYYPGCYLDHSFGDDPVYVQRFPFKTMAMLFSFLGNTGVSYLVKYLF
VSGILPPKLDFLDSVVSKHSKEIMDKTFLMNQDNITLSELVHVNPIHSASVSAALTNK
EAFEDIEPNPELSKSGND"
polyA_signal 2487..2492
/gene="CHT1"

ORIGIN 

Query Match 49.7%; Score 867; DB 5; Length 2528; 

Best Local Similarity 69.3%; Pred. No. 2.9e-227; 

Matches 1217; Conservative 0; Mismatches 520; Indels 18; Gaps 2; 

Qy 1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60 

I I I I III I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 

Db 49 ATGACCGTTCACATCGATGGGATCGTAGCGATCGTCCTGTTTTACTTGTTAATCTTATTT 108

Qy 61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCA------GCGCAGAAGAGCGC 114

I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I MM 

Db 109 GTTGGATTATGGGCTGCTTGGAAAAGTAAAAACACGTCAATGGAAGGAGCAATGGATCGG 168

Qy 115 AGCGAAGCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATG 174

I I I M M I I II I I I I II I II I I I I II I I II I I II II I II I I M I 

Db 169 AGTGAAGCTATAATGATTGGGGGAAGAGATATCGGGCTGCTGGTTGGTGGCTTCACAATG 228 

Qy 175 ACAGCTACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCA 234

II II II M I II I I I M II I I I I I II I I I I I II I I II II I II II II I II 

Db 229 ACCGCAACTTGGGTCGGTGGCGGTTATATCAATGGGACAGCAGAGGCGGTTTATGTTCCT 288 

Qy 235 GGTTATGGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGT 294

II I I M I I II I II I I I I I II II I I I I II I M II I I I I I I I I 

Db 289 GGGTACGGCTTGGCCTGGGCGCAGGCTCCCTTCGGATACGCACTCAGCCTGGTTATTGGC 348 

Qy 295 GGCCTGTTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTT 354 

I I I I I I I I I I I I I II II I I I I II I II I I II I I I II II I I I II I II I I 

Db 349 GGCTTATTTTTCGCTAAACCCATGCGCTCACGGGGTTACGTGACCATGCTGGACCCGTTT 408 

Qy 355 CAGCAAATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAA 414

I I I I I I I I I I I I I M I I I II I I I I I I M I I II II I I I II I II I 

Db 409 CAACAGATGTACGGTAAACGAATGGGAGGATTGCTCTTCATCCCCGCTCTCCTGGGGGAA 468

Qy 415 ATGTTCTGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGAT 474 

I I II I I I I I I M I I I I I I I I I I I II I I M I I II I II II III 

Db 469 ATCTTCTGGTCTGCAGCCATACTGTCCGCGCTAGGTGCAACTTTAAGCGTGATTGTGGAC 528 

Qy 475 GTGGATATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTG 534

I MM I I I M II I I I I I I I I I I I I II I I I I I I II 

Db 529 ATCAATATAAACGTATCAGTGGTAGTTTCCGCTGTGATCGCTGTATTATACACTCTGGTC 588

Qy 535 GGAGGGCTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGG 594 

II III I II II II II I I I I I I I I I II I I II I I I II II I I I II I I I I 

Db 589 GGCGGGTTATACTCGGTCGCGTACACAGATGTCGTCCAGTTGTTTTGCATCTTCTTAGGT 648



Qy 595 CTGTGGATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACT 654

Db 649 708


655 GCTGT GCAT GCCAAATACCAAAAGCCGT GGCT GGGAACT GTT GACT CAT CT GAAGT CTAC 714 

II III I I I I I I I I I I I I I I I I I I I I I I I I I I 

7 09 GCAAAT CAAGAAGTTTAT CAGGAGCCTTGGGT GGGAAATATACAAT CAAAGGACAGTTTA 768 

715 TCTTGGCTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTT 774 

III I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I III III 
Db 769 ATCTGGATTGACAACTTTCTATTACTGATGCTGGGTGGAATCCCGTGGCAAGTATATTTT 828 

Qy 775 CAGAGGGTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCT 834 

I I I I I II II III I I I I II II I I I I I I I I I I I I I I I I I I I I I II I I I II 

Db 829 CAGAGAGTCCTTTCTGCTTCTTCTGCTACCTATGCGCAAGTCCTGTCCTTTCTGGCTGCC 888 

Qy 835 TTCGGGTGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACA 894 

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 889 TTCGGTTGCGTTCTCATGGCCATCCCGTCTGTTCTCATCGGTGCAATAGGAACATCCACT 948 

Qy 895 GACTGGAACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATG 954 

I I I I I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I I I I I 

Db 949 GACTGGAATCAGACTTCCTATGGCTTGCCAGGCCCTATAGGCAAAAATGAGACTGATATG 1008 

Qy 955 ATTTTACCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGT 1014 

I I I I I II II II I I I I I I I I I I II II II I I I I I II I I I I I I I I I I I 

Db 1009 ATTTTGCCGATCGTGCTGCAGCATCTGTGTCCACCCTACATTTCCTTTTTTGGTCTTGGC 1068 

Qy 1015 GCAGTTTCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATG 1074 

II II I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 1069 GCTGTCTCTGCTGCTGTGATGTCATCGGCTGATTCTTCTATCTTATCAGCAAGTTCTATG 1128 

Qy 1075 TTTGCACGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTT 1134 

I I I II I I I I I II I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 1129 TTTGCTCGGAATATTTACCATCTTGCTTTCAGACAAGAGGCTTCAGACAAAGAAATAGTG 1188 

Qy 1135 TGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTG 1194 

I I I I I I I I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1189 TGGGTAATGCGAATCACCATATTTCTATTTGGAGGAGCTGCAACATCTATGGCATTGCTT 1248 

Qy 1195 ACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTC 1254 

I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1249 GCTCAATCAATCTATGGCCTCTGGTATCTGAGCTCAGATCTTGTCTACGTCATTATCTTT 1308 

Qy 1255 CCCCAGCTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGT 1314 

I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1309 CCTCAATTAATATCAGTGCTCTTCGTCAAGGGAACAAACACATATGGGTCTATTGCTGGA 1368 

Qy 1315 TATGTTTCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCC 1374 

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 

Db 1369 TATATCATTGGCTTTTTGCTTCGGATTAGTGGTGGTGAACCATATTTACATATGCAGCCA 1428 

Qy 1375 TTGATCTTCTACCCTGGCT------------ATTACCCTGATGATAATGGTATATATAAT 1422 

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 

Db 1429 TTTATTTATTACCCTGGATGCTATTTAGATCATTCCTTTGGAGATGATCCTGTTTATGTT 1488 

Qy 1423 CAGAAATTTCCATTTAAAACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATC 1482 

I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1489 CAGAGATTTCCCTTTAAAACCATGGCAATGTTATTCTCCTTCTTGGGCAACACTGGTGTA 1548 



Qy 1483 TCCTATCTAGCCAAGTATCTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTT 1542 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 1549 TCATATCTTGTCAAGTACCTGTTCGTAAGTGGAATATTGCCACCAAAATTAGACTTCCTT 1608 



Qy 1543 GATGCTGTTGTTGCAAGACACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAAT 1602 

II Mill I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I 

Db 1609 GACAGCGTTGTATCAAAACACAGTAAGGAAATCATGGACAAAACATTCTTGATGAATCAG 1668 

Qy 1603 GAAAATATTAAATTAGATGAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGC 1662 

I I I I I I I I I II I I II I I I I I I I I I I I I I I I I III 

Db 1669 GACAATATTACTTTGTCAGAGCTGGTGCATGTTAATCCAATACACAGTGCTTCAGTTAGT 1728 

Qy 1663 TCAACTTTCACCAATAAAGAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGG 1722 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I III I 

Db 1729 GCTGCTTTGACCAATAAGGAAGCATTTGAAGACATTGAGCCAAATCCTGAACTTTCTAAG 1788 

Qy 1723 ACTGAAGATAATTTA 1737 

II I I I I I I 

Db 1789 TCAGGCAATGATTGA 1803 
ALIGNMENTS 



RESULT 1 
AAF81712 

ID AAF81712 standard; cDNA; 1743 BP. 
XX 

AC AAF81712; 
XX 

DT 01-JUN-2001 (first entry) 
XX 

DE Human high affinity choline transporter protein encoding cDNA. 
XX 

KW High affinity choline transporter; cho-1; Alzheimer's disease; diagnosis; 

KW ss. 

XX 

OS Homo sapiens. 



XX 

FH Key Location/Qualifiers 

FT CDS 1..1743 

FT /*tag= a 

FT /product= "high affinity choline transporter" 
XX 

PN WO200116315-A1. 
XX 

PD 08-MAR-2001. 
XX 

PF 18-AUG-2000; 2000WO-JP005545. 
XX 

PR 27-AUG-1999; 99JP-00240642. 

PR 27-DEC-1999; 99JP-00368991. 
XX 

PA (NISC-) JAPAN SCI & TECHNOLOGY CORP. 
XX 

PI Haga T, Okuda T; 
XX 

DR WPI; 2001-226688/23. 

DR P-PSDB; AAB74665. 
XX 

PT New rat and human spinal cord high affinity choline transporters, useful 

PT in diagnosis of Alzheimer's disease and screening promoters as drugs for 

PT treating Alzheimer's disease. 
XX 

PS Claim 9; Page 71-75; 90pp; Japanese. 
XX 

CC The present sequence encodes a human (Homo sapiens) high affinity choline 

CC transporter protein designated cho-1. The cho-1 protein has nootropic and 

CC neuroprotective activities. The cho-1 polynucleotide and protein can be 

CC used for the diagnosis of diseases related to the expression of cho-1 by 

CC comparing the cho-1 polynucleotide sequence in a sample to that of a 

CC control. Drug compositions containing the cho-1 protein or expression 

CC promoters or inhibitors of cho-1 are useful for treating disorders 

CC characterised by abnormal levels of cho-1, such as Alzheimer's disease 
XX 

SQ Sequence 1743 BP; 412 A; 393 C; 406 G; 532 T; 0 U; 0 Other; 

Query Match 100.0%; Score 1743; DB 4; Length 1743; 
Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1743; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy    1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

Qy   61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

Qy  121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

Qy  181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

Qy  241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300

Qy  301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

Qy  361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

Qy  421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

Qy  481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

Qy  541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600

Qy  601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660

Qy  661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

Qy  721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780

Qy  781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

Qy  841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900

Qy  901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

Qy  961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020

Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080

Qy 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140

Qy 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260

Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320

Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380

Qy 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

Qy 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

Qy 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560

Qy 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

Qy 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

Qy 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

Qy 1741 TGA 1743
        |||
Db 1741 TGA 1743



RESULT 2 
AAH49207 

ID AAH49207 standard; cDNA; 1743 BP. 
XX 

AC AAH49207; 
XX 

DT 26-NOV-2001 (first entry) 



XX 

DE Human CHOT encoding cDNA. 
XX 

KW CHOT; human; choline transporter; chromosome 2q11-13; nootropic; 

KW neuroprotective; gene therapy; antisense therapy; degenerative disease; 

KW cognitive disorder; Alzheimer's disease; ss. 

XX 

OS Homo sapiens. 
XX 

PN DE10009055-A1. 
XX 

PD 30-AUG-2001. 
XX 

PF 28-FEB-2000; 2000DE-01009055. 
XX 

PR 28-FEB-2000; 2000DE-01009055. 
XX 

PA (BRUE/) BRUESS M. 

PA (BOEN/) BOENISCH H. 
XX 

PI Bruess M, Boenisch H; 
XX 

DR WPI; 2001-590709/67. 

DR P-PSDB; AAB86837. 
XX 

PT A new gene encoding human choline transporter, designated hCHOT, is 

PT located on chromosome 2q11-13 and is useful to treat degenerative 

PT disorders such as Alzheimer's disease. 
XX 

PS Disclosure; Page 11; 12pp; German. 
XX 

CC This invention describes a novel gene encoding human choline transporter, 

CC designated hCHOT which is located on chromosome 2q11-13. The products of 

CC the invention have nootropic and neuroprotective activity and can be used 

CC for gene or antisense therapy. (I) is used to treat degenerative disease, 

CC particularly cognitive disorders such as Alzheimer's disease. Sense and 

CC antisense oligonucleotides derived from the gene may be used in 

CC diagnostics and other techniques. This sequence encodes the human CHOT 

CC protein described in the invention 
XX 

SQ Sequence 1743 BP; 412 A; 393 C; 406 G; 532 T; 0 U; 0 Other; 

Query Match 100.0%; Score 1743; DB 5; Length 1743; 
Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1743; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy    1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

Qy   61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

Qy  121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

Qy  181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

Qy  241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300

Qy  301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

Qy  361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

Qy  421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

Qy  481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

Qy  541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600

Qy  601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660

Qy  661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

Qy  721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780

Qy  781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

Qy  841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900

Qy  901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

Qy  961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020

Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080

Qy 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140

Qy 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260

Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320

Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380

Qy 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

Qy 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

Qy 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560

Qy 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

Qy 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

Qy 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

Qy 1741 TGA 1743
        |||
Db 1741 TGA 1743



RESULT 3 
ADD50638 

ID ADD50638 standard; cDNA; 1743 BP. 
XX 



AC ADD50638; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE cDNA encoding human high-affinity choline transporter (hCHT). 
XX 

KW Human; high-affinity choline transporter; hCHT; chromosome 2q12; 

KW cholinergic function; Parkinson's disease; Huntington's disease; 

KW Alzheimer's disease; schizophrenia; dysautonomia; myasthenia gravis; 

KW brain; cholinergic signalling; antiparkinsonian; anticonvulsant; 

KW nootropic; neuroprotective; neuroleptic; gene; ss. 

XX 

OS Homo sapiens. 



XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077. 
XX 

PR 23-JUL-2001; 2001US-00911077. 
XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S. 

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 

DR P-PSDB; ADD50639. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Claim 2; SEQ ID NO 1; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene 

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence 

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT 

CC polynucleotide sequence when delivered to a cell, increases cholinergic 

CC function in the cell that is in a patient having Parkinson's disease, 

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or 

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 

CC signalling. The present sequence encodes hCHT. Note: The sequence data 

CC for this patent was obtained in electronic format directly from the USPTO 

CC web site at seqdata.uspto.gov. 
XX 

FH Key Location/Qualifiers 

FT CDS 1..1743 

FT /*tag= a 

FT /product= "hCHT" 
XX 

SQ Sequence 1743 BP; 412 A; 393 C; 406 G; 532 T; 0 U; 0 Other; 



Query Match 100.0%; Score 1743; DB 9; Length 1743; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1743; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
Qy    1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

Qy   61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

Qy  121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

Qy  181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

Qy  241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300

Qy  301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

Qy  361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

Qy  421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

Qy  481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

Qy  541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600

Qy  601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660

Qy  661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

Qy  721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780

Qy  781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

Qy  841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900

Qy  901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

Qy  961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020

Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080

Qy 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140

Qy 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260

Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320

Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380

Qy 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

Qy 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560 

1561 CACAGT GAAGAAAAC AT G GAT AAG AC AAT T CTT GT C AAAAAT GAAAAT AT T AAAT T AGAT 1620 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I 



Db 1561 C AC AG T GAAG AAAAC AT G G AT AAGAC AAT T C T T GT C AAAAAT GAAAAT AT T AAAT TAG AT 1620 



Qy 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

Qy 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

Qy 1741 TGA 1743 

I I I 

Db 1741 TGA 1743 

RESULT 4 
ADD50646 



ID ADD50646 standard; DNA; 1813 BP. 
XX 

AC ADD50646; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE High-affinity choline transporter (CHT) associated DNA sequence #2. 
XX 

KW High-affinity choline transporter; CHT; cholinergic function; 

KW Parkinson's disease; Huntington's disease; Alzheimer's disease; 

KW schizophrenia; dysautonomia; myasthenia gravis; brain; 

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 

KW neuroprotective; neuroleptic; ds.

XX 

OS Unidentified. 
XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077.
XX

PR 23-JUL-2001; 2001US-00911077.
XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S. 

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Disclosure; SEQ ID NO 9; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 



CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT

CC polynucleotide sequence when delivered to a cell, increases cholinergic 

CC function in the cell that is in a patient having Parkinson's disease, 

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 

CC signalling. The present DNA sequence of unknown function is provided in 

CC the electronic sequence data but is not mentioned in the printed 

CC specification. Note: The sequence data for this patent was obtained in 

CC electronic format directly from the USPTO web site at seqdata.uspto.gov. 

XX 

SQ Sequence 1813 BP; 440 A; 406 C; 417 G; 550 T; 0 U; 0 Other; 



Query Match 100.0%; Score 1743; DB 9; Length 1813; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1743; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 19 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 78

Qy 61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 79 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 138

Qy 121 GCCAT CATAGTT GGT GGCCGAGATATTGGTTTATTGGTT GGT GGATTTACCAT GACAGCT 180 

I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 139 GC CAT CATAGTT GGTGGC CGAGAT AT TGGTTT ATT GGTT GGT GGATTTACCAT GACAGCT 198 

Qy 181 AC CT GGGT C GGAGGAG GGT AT AT CAAT GGCACAGCT GAAG CAGTTT AT GT ACC AGGT TAT 24 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II ! I I I I I I I I I 
Db 199 AC CTGGGT CGGAGGAGGGT ATAT CAAT GGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 258 

Qy 241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 259 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 318 

Qy 301 T T CT T T GCAAAACCTAT GC GT T CAAAGGG GTAT GT GACC AT GT T AGAC C C GT T T C AGCAA 360 

I I I I I I I I |'| I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 319 T T CTT T GCAAAAC CT AT GCGTT CAAAGG GGT AT GT GAC CAT GT TAGAC C C GT T T C AGCAA 378 

Qy 361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 37 9 AT CT AT G GAAAAC GCAT GGGC GGACT CC T GT T TAT T CCT GCACTGATGGGAGAAAT GT T C 4 38 

Qy 421 T GGGCT GCAGCAAT T TT CT CT GCTT T G GGAGC CAC CAT CAGC GT GAT CAT C GAT GT GGAT 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 439 T GGGCT G CAGCAAT T TT CT CT GCTT T GG GAG C CAC CAT CAGC GT GAT CAT C GAT GT GGAT 4 98 



Qy 481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 499 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 558



Qy 541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M 

Db 559 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 618 

Qy 601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 619 AT C AGC GT CC C CTT T GC ATT GT CACAT C CT GCAGT CGCAGACAT C GGGT T CACT G CT GT G 678 

Qy 661 CAT GCCAAATACCAAAAGCCGT GGCT GGGAACT GTT GACT CAT CT GAAGT CT ACT CTT GG 720 

I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 67 9 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 738 

Qy 721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780 

I | | | I | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 739 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 798 

Qy 781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 799 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 858

Qy 841 T GC CT GGT GAT GGC CAT CCCAGCCATACT CATT GGGGCCATT GGAGCAT CAACAGACT GG 900 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I I I II I I I I I I I 
Db 859 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 918 

Qy 901 AAC CAGACT GCATAT GGGCTT CCAGAT CCCAAGACTACAGAAGAGGCAGACAT GATTTT A 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 919 AAC CAGACT GC AT AT GG G CT T CC AGAT CC CAAGACT ACAGAAGAGGC AGAC AT GATT T T A 978 

Qy 961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 102 0 

I I I I 1 1 1 1 i I I I 1 1 I I I 1 1 1 1 1 I I 1 1 1 I I 1 1 1 1 1 I I 1 1 I 1 1 1 1 I 1 1 I I I I I I I I I 1 1 1 I I 

Db 979 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1038 

Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 108 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1039 T CT GCT G CT GTT AT GT CAT CAGC AGAT T CT T CC AT CTT GT CAGCAAGT T C CAT GT TT GCA 1098 

Qy 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 114 0 

I I I M I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I 
Db 1099 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1158 

Qy 1141 AT G C GAAT C ACAGT GT T T GT GT T T G GAGC AT CT GCAAC AGC C AT GGCCTTGCT GAC GAAA 1200 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 

Db 1159 AT G C GAAT C ACAGT GT T T GT GT T T GGAG CAT CT GCAAC AGC CAT GGCCTTGCT GAC GAAA 1218 

Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260 

I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 

Db 1219 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 127 8 

Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M 

Db 1279 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1338 

Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I 
Db 1339 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1398 



Qy 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 1399 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1458



Qy 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 1459 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1518

Qy 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 1519 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1578

Qy 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 1579 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1638

Qy 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 1639 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1698

Qy 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 1699 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1758

Qy 1741 TGA 1743

        |||

Db 1759 TGA 1761



RESULT 5 
ABX94338 

ID ABX94338 standard; cDNA; 1743 BP. 
XX 

AC ABX94338; 
XX 

DT 13-JUN-2003 (first entry) 
XX 

DE Human cDNA encoding high affinity choline transporter, HACT. 
XX 

KW Human; ss; gene; HACT; high affinity choline transporter; pain; 

KW neurotransmitter biosynthesis; learning and memory; aging; epilepsy; 

KW neurological disorder; spasticity; myoclonus; muscle spasm; 

KW muscle hyperactivity; stroke; head trauma; neuronal cell death; 

KW multiple sclerosis; spinal chord injury; dystonia; Alzheimer's disease; 

KW Myasthenia Gravis; multi-infarct dementia; AIDS dementia;

KW Parkinson's disease; Huntington's disease; amyotrophic lateral sclerosis; 

KW ALS; attention deficit disorder; organic brain syndrome; schizophrenia; 

KW nicotine addiction; memory disorder; cognitive disorder. 
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT CDS 1. .1743 

FT /*tag= a 

FT /product= "HACT"

XX 

PN US6500643-B1. 



PD 31-DEC-2002.
XX 

PF 07-SEP-2000; 2000US-00657252.
XX

PR 07-SEP-2000; 2000US-00657252.
XX 

PA (UYFL ) UNIV FLORIDA. 
XX 

PI Wu D, Gu Y, Millard WJ, He Y; 
XX 

DR WPI; 2003-361535/34. 

DR P-PSDB; ABU08979. 
XX 

PT Novel isolated polynucleotide (I) that encodes high affinity choline 

PT transporter protein, useful for preventing, treating or ameliorating 

PT neurological and cognitive disorders such as Alzheimer's or Parkinson's 

PT disease. 
XX 

PS Claim 2; Col 17-21; 20pp; English. 
XX 

CC The invention relates to an isolated polynucleotide which encodes a high 

CC affinity choline transporter (HACT) protein appearing as ABU08979. Also 

CC included are a polynucleotide encoding a fragment consisting of at least 

CC about 50 amino acids of the HACT protein, a vector comprising the 

CC polynucleotide, a composition comprising a vector comprising a 

CC polynucleotide which comprises at least about 12 contiguous nucleic acids 

CC of a polynucleotide appearing as ABX94339 (encoding choline 

CC acetyltransferase), a recombinant host cell which comprises the vector

CC (used to express the HACT protein or fragment). The polynucleotide is

CC useful as a probe or primer to detect the presence of HACT polynucleotide 

CC in a sample, such as a biological sample, or for screening for test 

CC agents which bind to the polynucleotide. A pharmaceutical composition 

CC comprising the polynucleotide is useful for preventing, treating or 

CC ameliorating neurological and cognitive disorders e.g. pain, spasticity, 

CC myoclonus, muscle spasm, muscle hyperactivity, epilepsy, stroke, head 

CC trauma, neuronal cell death, multiple sclerosis, spinal chord injury, 

CC dystonia, Alzheimer's disease, myasthenia gravis, multi-infarct

CC dementia, AIDS dementia, Parkinson's disease, Huntington's disease, 

CC amyotrophic lateral sclerosis (ALS), attention deficit disorder, nicotine

CC addiction, organic brain syndromes, schizophrenia or memory and cognitive 

CC disorders. HACT is thought to be the rate limiting step in cholinergic 

CC neurotransmitter biosynthesis and regeneration (cholinergic transmissions 

CC are crucial to brain functions such as learning and memory). The present

CC sequence encodes human HACT 

XX 

SQ Sequence 1743 BP; 411 A; 395 C; 405 G; 532 T; 0 U; 0 Other; 

Query Match 99.7%; Score 1738.2; DB 8; Length 1743; 
Best Local Similarity 99.8%; Pred. No. 0; 

Matches 1740; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy 1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60 

I I I i I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 

Db 1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60 



Qy 61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120



1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 61 GTT GGAAT AT G GG CTG C CT GGAGAAC CAAAAAC AGT G G C AG CGC AGAAGAGC GCAGC GAA 120 

Qy 121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180 

Qy 181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 24 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 
Db 181 AC CT GGGT CGGAG GAG G GT ATAT CAAT GGCACAGCT GAAGCAGT T TAT GT AC CAGGTT AT 24 0 

Qy 241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300 

Qy 301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360 

Qy 361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I 
Db 361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420 

Qy 421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 48 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 T GG GCT GCAG CAAT TT T CT CT GCT TT GGGAGC C ACC AT C AGCGT GAT CAT C GAT GT GGAT 48 0 

Qy 481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 54 0 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i I 

Db 481 AT GCACATTT CT GT CAT CAT CT CT GCACTC ATT GCCACT CT GT ACACACT GGT GGGAGGG 540 

Qy 541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600 

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 
Db 541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600 

Qy 601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660 

Qy 661 CAT GCCAAATACCAAAAGCCGT GGCT GGGAACT GTT GACT CAT CT GAAGT CT ACT CTT GG 720 

I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720 

Qy 721 CTT GAT AGT TTTCTGTTGTT GAT G CT GGGT GGAAT C C CAT G GC AAGC AT ACT T T CAGAGG 780 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 721 CTT GAT AGT TTTCTGTTGTT GAT GCT GGGT G GAAT C C CAT GG C AAG CAT ACT T T CAGAGG 78 0 

Qy 781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840 

Qy 841 T GCCT GGT GAT GGCCAT C CCAGCCAT ACTCATT GGGGCCATT GGAGCAT CAACAGACT GG 900 

I I I I I I I I I II I I I II I I I II I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II 
Db 841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCCTCCACAGACTGG 900 

Qy 901 AAC CAGACT GC AT AT GGGCTT C C AGAT CC CAAGACT ACAGAAGAGGC AGACAT GAT T T T A 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I II I 



Db 901 AAC C AGACT GC AT AT GGGCT T C CAGAT CC CAAGACT ACAGAAGAGGC AGACAT GAT T T T A 960 

Qy 961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020 

Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 108 0 

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I 
Db 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080 

Qy 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 114 0 

I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140 

Qy 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200 

I I I I I I It I I I I I I I I I I I I I I I I 1-1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1141 AT G C GAAT C AC AGT GT T T GT GT T T G GAGC AT CT GC AAC AGC CAT GG C CT T GCT GAC GAAA 1200 

Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260 

Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320 

Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 138 0 

Qy 1381 T T C T AC C CT G G C T AT T AC C CT GAT G AT AAT GG T AT AT AT AAT C AGAAAT T T C CAT T T AAA 1440 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440 

Qy 1441 AC ACT T GC C AT GGTT AC AT CAT T CT T AAC C AAC AT TT G CAT CT C CT AT CT AG C CAAGTAT 1500 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1441 AC ACT T G C CAT G GT T AC AT CAT T CT T AAC C AAC AT T T GC AT CT C CT AT CT AGC CAAGTAT 1500 

Qy 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560 

Qy 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620 

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1561 C AC AGT GAAG AAAAC AT G GAT AAGAC AAT T CT T GT C AAAAAT GAAAAT AT T AAAT TAG AT 1620 

Qy 1621 GAACT T GC ACT T GT GAAGC CAC GAC AGAGC AT GAC C CT C AGCT C AACT T T C AC CAAT AAA 1680 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1621 GAACTTGCACTT GTGAAGCCAC GACAGAGCAT GACCCTCAGCT CAACTTT CAC CAAT AAA 1680 

Qy 1681 GAGGCCTT CCTT GATGTTGATTCCAGT CCAGAAGGGTCT GGGACTGAAGATAATTT ACAG 1740 

I I I I I I I I I I I I I I II I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1681 GAGGC CT T C CT T GAT GTT GAT T C C AGT C C AGAAG GGT CT GG GACT GAAG AT AAT T T ACAA 1740 

Qy 1741 TGA 1743

I I I 

Db 1741 TGA 1743 



RESULT 6 
AAF81711 

ID AAF81711 standard; cDNA; 1743 BP. 
XX 

AC AAF81711; 
XX 

DT 01-JUN-2001 (first entry) 
XX 

DE Rat high affinity choline transporter protein encoding cDNA. 
XX 

KW High affinity choline transporter; cho-1; Alzheimer's disease; diagnosis; 

KW ss.

XX 

OS Rattus norvegicus. 
XX 

FH Key Location/Qualifiers 

FT CDS 1. .1743 

FT /*tag= a

FT /product= "high affinity choline transporter" 
XX 

PN WO200116315-A1. 
XX 

PD 08-MAR-2001. 
XX 

PF 18-AUG-2000; 2000WO-JP005545.
XX

PR 27-AUG-1999; 99JP-00240642.

PR 27-DEC-1999; 99JP-00368991.
XX 

PA (NISC-) JAPAN SCI & TECHNOLOGY CORP. 
XX 

PI Haga T, Okuda T; 
XX 

DR WPI; 2001-226688/23. 

DR P-PSDB; AAB74664. 
XX 

PT New rat and human spinal cord high affinity choline transporters, useful

PT in diagnosis of Alzheimer's disease and screening promoters as drugs for 

PT treating Alzheimer's disease. 
XX 

PS Claim 6; Page 64-68; 90pp; Japanese. 
XX 

CC The present sequence encodes a rat (Rattus norvegicus) high affinity 

CC choline transporter protein designated cho-1. The cho-1 protein has 

CC nootropic and neuroprotective activities. The cho-1 polynucleotide and 

CC protein can be used for the diagnosis of diseases related to the 

CC expression of cho-1 by comparing the cho-1 polynucleotide sequence in a 

CC sample to that of a control. Drug compositions containing the cho-1 

CC protein or expression promoters or inhibitors of cho-1 are useful for 

CC treating disorders characterised by abnormal levels of cho-1, such as 

CC Alzheimer's disease 
XX 

SQ Sequence 1743 BP; 414 A; 402 C; 404 G; 523 T; 0 U; 0 Other; 



Query Match 80.0%; Score 1394.2; DB 4; Length 1743;



Best Local Similarity 87.5%; Pred. No. 0; 

Matches 1525; Conservative 0; Mismatches 218; Indels 0; Gaps 0; 

1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60 

III I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I I I II I I Ml 

1 ATGCCTTTCCATGTAGAAGGACTAGTAGCGATTATCCTGTTCTACCTTCTTATATTTCTG 60 

61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 12 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I 

61 GTT GGAAT AT GG GCT GC AT GGAAAACCAAAAAC AG C G GTAAT G C AGAAGAACGCAGC GAA 120 

121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180 

I I I I I I I I I I I I II I I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I 

121 GCCATCATAGTTGGGGGCCGAGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 18 0 

181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240 

I I I I I I I I I I I I I I I I I i Mill II I II I I I I I I I I I I I I I I I I I I I I I I I I 
181 ACC T GGGT T GGAG GAG GT T ACAT CAAC GGGACAG CT GAAGC AGT T TAT GGGC CAGGT T GT 240 

241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 

241 GGTCTAGCTTGGGCTCAGGCACCCATTGGATATTCTCTGAGTCTGATTTTAGGTGGCCTG 300 

301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360 

II I I I I I I II I I I I I I I I I I I I I I I I I MINIM I II I II I I I I I I I M I I II 
301 TTTTTT GCAAAAC CT AT GCGT T CCAAGGGAT AT GT GACT AT GTTAGACCCGTTT CAAC AG 360 

361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 42 0 

I I II I I I I M I I I II II II II II Mill II I I II I I II I M I II I I I II M I I 

361 ATCTATGGAAAGCGCATGGGTGGGCTGCTGTTCATCCCTGCACTGATGGGAGAGATGTTC 420 

421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480 

II I II I I II I M I M II II I I II M II II I I II I II I I I I I II II I II I I I I M 

421 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCTACCATCAGCGTAATCATTGATGTGGAT 480 

481 AT GCACATT T CT GT CAT CAT CT CT GCACT C ATT GCC ACT CT GT ACACACT GGT GGGAGGG 540 

II MM II II I I I MM M I II I II M I M III M II II I I I I I I I I I 

481 GTGAACATATCGGTCATTGTCTCCGCACTCATTGCCATTCTTTATACCCTCGTGGGAGGG 540 

541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600

II I I I I II I I II I II M I I I II I II I II I I II I II II II I I II II I I I II 
541 CT CTACT CT GT GGCAT AT ACT GAT GT T GT AC AG CT AT T CT G CAT TT T T AT AGGAT T GT GG 600 

601 AT C AGC GTCCCCTTTG CAT T GT C ACAT C CT GCAGT C GC AGACAT C GGGT T C ACT GCT GT G 660 

I I I I I I I I II I II II I I M II I II II II I I I I I Mill II I I I I II I I I I M 
601 AT C AGT GT C C CAT TTGCCCTGT CACAT C CT GCAGT C AC C GAC ATT G GAT T C ACT G CT GT G 660 

661 CAT GCCAAATACCAAAAGCCGT GGCT GGGAACT GTT GACT CAT CT GAAGT CT ACTCTT GG 720 

I II II I I I I I I I I I II II II II I I II I' I II I III I I I I II I I I I I III 

661 CAT GCT AAAT AC CAGAGT C C CT GGCT GGGAAC CAT T GAAT C AGT T GAAGT CTAC AC CTGG 720 

721 CTT GAT AGT TTTCTGTTGTT GAT GCT GGGT GGAAT C C CAT GGC AAG CAT ACT TT C AGAG G 780 

II I II I I I M I I II II II I I I I I II I I I I I I II I I I I I I I I I I I I II I I I II I M I 
721 CTT GAT AAT TTTCTGTTGTT GAT GCT G G GT GGAAT AC CAT GG CAAGC CTACT T C C AGAG G 7 80 

Qy 781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

       || |||||||| || ||||| ||||||||||| ||||||||||||||||||||||| |||

Db 781 GTCCTCTCTTCATCGTCAGCGACCTATGCTCAGGTGCTGTCCTTCCTGGCAGCTTTTGGG 840

Qy 841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900

       ||||||||||||||  | ||||||||   |||||||||||||||||| || |||||||||

Db 841 TGCCTGGTGATGGCTCTACCAGCCATTTGCATTGGGGCCATTGGAGCCTCCACAGACTGG 900

Qy 901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

       ||||| |||||||||||| |||||||||||||||| |  || || |||||||||||| |

Db 901 AACCAAACTGCATATGGGTTTCCAGATCCCAAGACCAAGGAGGAAGCAGACATGATTCTC 960

Qy 961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020

II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 

961 CCGATTGTTCTACAGTACCTCTGCCCTGTGTACATTTCCTTCTTTGGGCTTGGTGCTGTT 1020 

Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080

I I I I I I I I I I I I I I I I II II II II I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1021 TCTGCTGCTGTCATGTCCTCGGCTGACTCATCCATCCTATCAGCAAGTTCCATGTTTGCT 1080

Qy 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I M I I I I I 
Db 1081 CGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAAATTGTGTGGGTC 1140

Qy 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

Ml I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1141 ATGAGGATCACTGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTCACGAAG 1200

Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I M I I I 

Db 1201 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1260

Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II II I M I I I II 
Db 1261 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1320

Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380

I III II I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I I I I I II I III 
Db 1321 TTTGGACTTTTCCTGAGAATTACCGGAGGAGAGCCATATCTATACTTGCAGCCCTTAATC 1380

Qy 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 
Db 1381 TTCTACCCTGGTTATTACCCTGACAAGAATGGTATATACAATCAGAGGTTCCCATTTAAA 1440

Qy 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1441 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCCTATCTAGCCAAGTAT 1500

Qy 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560

I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1501 CTATTTGAAAGTGGAACCTTGCCTCCAAAATTAGATATATTTGATGCTGTTGTCTCAAGG 1560

Qy 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II 
Db 1561 CACAGTGAAGAGAACATGGACAAGACCATTCTAGTCAGAAATGAAAACATCAAATTAAAT 1620

Qy 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

I I I I I I I II I III I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I 
Db 1621 GAACTTGCACCTGTAAAGCCTCGACAGAGCCTAACCCTCAGTTCAACTTTCACCAATAAA 1680



Qy 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I > 
Db 1681 GAGGCTCTCCTTGATGTTGATTCCAGTCCAGAGGGATCTGGGACTGAAGATAACTTACAA 1740

Qy 1741 TGA 1743 

I I I 

Db 1741 TGA 1743 



RESULT 7 
ADD50642 

ID ADD50642 standard; cDNA; 4904 BP. 
XX 

AC ADD50642; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE cDNA encoding rat high-affinity choline transporter (rCHT).
XX 

KW Rat; high-affinity choline transporter; rCHT; cholinergic function; 

KW Parkinson's disease; Huntington's disease; Alzheimer's disease; 

KW schizophrenia; dysautonomia; myasthenia gravis; brain;

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 

KW neuroprotective; neuroleptic; gene; ss. 

XX 

OS Rattus sp. 
XX 

FH Key Location/Qualifiers 

FT CDS 224..1966

FT /*tag= a

FT /product= "rCHT"

XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077.
XX 

PR 23-JUL-2001; 2001US-00911077.
XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S. 

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 
DR P-PSDB; ADD50643. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 
PT choline transporter polypeptide, useful in gene therapy to increase 
PT cholinergic function in a cell of a patient suffering from Alzheimer's 
PT disease. 
XX 

PS Example 1; SEQ ID NO 5; 74pp; English. 
XX 



CC The present invention relates to the isolation of polynucleotide
CC sequences encoding human and mouse high-affinity choline transporter
CC (hCHT and mCHT respectively), and the proteins they encode. The gene
CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence
CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT
CC polynucleotide sequence when delivered to a cell, increases cholinergic
CC function in the cell that is in a patient having Parkinson's disease,
CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or
CC myasthenia gravis. The hCHT antibody is useful for controlling
CC transporter CHT proteins to the brain, and for treating the above
CC mentioned diseases. The antibody is also useful for diagnosing the above
CC mentioned disorders and to detect the influence of cholinergic
CC signalling. The present sequence encodes rat CHT (rCHT). Note: The
CC sequence data for this patent was obtained in electronic format directly
CC from the USPTO web site at seqdata.uspto.gov.

XX 

SQ Sequence 4904 BP; 1447 A; 991 C; 939 G; 1527 T; 0 U; 0 Other; 



Query Match 80.0%; Score 1394.2; DB 9; Length 4904; 

Best Local Similarity 87.5%; Pred. No. 0; 

Matches 1525; Conservative 0; Mismatches 218; Indels 0; Gaps 0; 

Qy 1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 224 ATGCCTTTCCATGTAGAAGGACTAGTAGCGATTATCCTGTTCTACCTTCTTATATTTCTG 283 

Qy 61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I II I I I I I I I I I I I I I I I I II 
Db 284 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGTAATGCAGAAGAACGCAGCGAA 343 

Qy 121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 344 GCCATCATAGTTGGGGGCCGAGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 403

Qy 181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 404 ACCTGGGTTGGAGGAGGTTACATCAACGGGACAGCTGAAGCAGTTTATGGGCCAGGTTGT 463

Qy 241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 464 GGTCTAGCTTGGGCTCAGGCACCCATTGGATATTCTCTGAGTCTGATTTTAGGTGGCCTG 523 

Qy 301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

II I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I I II I I I I I I I I II 

Db 524 TTTTTTGCAAAACCTATGCGTTCCAAGGGATATGTGACTATGTTAGACCCGTTTCAACAG 583 

Qy 361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

I I I I I I I I I I I I I I I I I I I II II Mill II I I I I I I I I I II I I I I I I I I I I I I 
Db 584 ATCTATGGAAAGCGCATGGGTGGGCTGCTGTTCATCCCTGCACTGATGGGAGAGATGTTC 643

Qy 421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480 

I I I I I I I I I I I I I I I I I I I I I I I II II II I I I I I I I I I I I I I I I I II I I I I I I I 

Db 644 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCTACCATCAGCGTAATCATTGATGTGGAT 703 

Qy 481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

II I I I I II I I I II I I I I I I I I I I I I I I I I I III II II II I I I I I I I I I 
Db 704 GTGAACATATCGGTCATTGTCTCCGCACTCATTGCCATTCTTTATACCCTCGTGGGAGGG 763



Qy 541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 764 CTCTACTCTGTGGCATATACTGATGTTGTACAGCTATTCTGCATTTTTATAGGATTGTGG 823

Qy 601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660 

I I I I I I I I I I I I I I 1 I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 

Db 824 ATCAGTGTCCCATTTGCCCTGTCACATCCTGCAGTCACCGACATTGGATTCACTGCTGTG 883 

Qy 661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 

Db 884 CATGCTAAATACCAGAGTCCCTGGCTGGGAACCATTGAATCAGTTGAAGTCTACACCTGG 943

Qy 721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I 

Db 944 CTTGATAATTTTCTGTTGTTGATGCTGGGTGGAATACCATGGCAAGCCTACTTCCAGAGG 1003

Qy 781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 1004 GTCCTCTCTTCATCGTCAGCGACCTATGCTCAGGTGCTGTCCTTCCTGGCAGCTTTTGGG 1063 

Qy 841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 

Db 1064 TGCCTGGTGATGGCTCTACCAGCCATTTGCATTGGGGCCATTGGAGCCTCCACAGACTGG 1123

Qy 901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

I I I I I I I I I I I I I I I I I I I I ! I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1124 AACCAAACTGCATATGGGTTTCCAGATCCCAAGACCAAGGAGGAAGCAGACATGATTCTC 1183

Qy 961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I II! 

Db 1184 CCGATTGTTCTACAGTACCTCTGCCCTGTGTACATTTCCTTCTTTGGGCTTGGTGCTGTT 1243 

Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080

I I I I I I I I I I I I I I I I II II II II I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1244 TCTGCTGCTGTCATGTCCTCGGCTGACTCATCCATCCTATCAGCAAGTTCCATGTTTGCT 1303 

Qy 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I 

Db 1304 CGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAAATTGTGTGGGTC 1363

Qy 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

III I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I II I I II I I 

Db 1364 ATGAGGATCACTGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTCACGAAG 1423

Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I III 

Db 1424 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1483

Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1484 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1543 

Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380 

I III II I I I I I I I I I I I II Mill I I II I I I I I I I II I I I I I I I I I III 

Db 1544 TTTGGACTTTTCCTGAGAATTACCGGAGGAGAGCCATATCTATACTTGCAGCCCTTAATC 1603



Qy 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II MINIMI

Db 1604 TTCTACCCTGGTTATTACCCTGACAAGAATGGTATATACAATCAGAGGTTCCCATTTAAA 1663



Qy 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 
Db 1664 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCCTATCTAGCCAAGTAT 1723

Qy 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560 

I I I II I I I I I I I I I I I II I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1724 CTATTTGAAAGTGGAACCTTGCCTCCAAAATTAGATATATTTGATGCTGTTGTCTCAAGG 1783

Qy 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II 

Db 1784 CACAGTGAAGAGAACATGGACAAGACCATTCTAGTCAGAAATGAAAACATCAAATTAAAT 1843

Qy 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

I I I I I I I I II III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1844 GAACTTGCACCTGTAAAGCCTCGACAGAGCCTAACCCTCAGTTCAACTTTCACCAATAAA 1903 

Qy 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I 

Db 1904 GAGGCTCTCCTTGATGTTGATTCCAGTCCAGAGGGATCTGGGACTGAAGATAACTTACAA 1963

Qy 1741 TGA 1743 

I II 

Db 1964 TGA 1966 
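
Note (editorial sketch, not part of the search output): the summary figures reported for RESULT 7 are consistent with two simple ratios. Assuming that Query Match is the hit score divided by the perfect score of 1743 given in the run header, and that Best Local Similarity is the number of matches divided by the number of aligned columns (matches + mismatches + indels), the Python sketch below reproduces the reported 80.0% and 87.5%. The function names are hypothetical, and the formulas are inferred from the reported numbers rather than taken from the search software's documentation.

```python
# Perfect score for the query, as stated in the run header of this report.
PERFECT_SCORE = 1743


def percent(numerator: float, denominator: float) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)


def query_match(score: float, perfect_score: float = PERFECT_SCORE) -> float:
    """Assumed definition: hit score as a percentage of the perfect score."""
    return percent(score, perfect_score)


def best_local_similarity(matches: int, mismatches: int, indels: int = 0) -> float:
    """Assumed definition: identical columns as a percentage of aligned columns."""
    return percent(matches, matches + mismatches + indels)


if __name__ == "__main__":
    # RESULT 7 (ADD50642): Score 1394.2; Matches 1525; Mismatches 218; Indels 0
    print(query_match(1394.2))               # 80.0
    print(best_local_similarity(1525, 218))  # 87.5
```

The same two ratios applied to a hit with Score 1375, Matches 1513 and Mismatches 230 give 78.9% and 86.8%.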



RESULT 8 
ADD50640 

ID ADD50640 standard; cDNA; 1743 BP. 
XX 

AC ADD50640; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE cDNA encoding mouse high-affinity choline transporter (mCHT) #1.
XX 

KW Mouse; high-affinity choline transporter; mCHT; cholinergic function; 

KW Parkinson's disease; Huntington's disease; Alzheimer's disease;

KW schizophrenia; dysautonomia; myasthenia gravis; brain; 

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 

KW neuroprotective; neuroleptic; gene; ss. 

XX 

OS Mus sp.
XX 

FH Key Location/Qualifiers 

FT CDS 1..1743

FT /*tag= a 

FT /product= "mCHT #1" 

XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077.



XX 

PR 23-JUL-2001; 2001US-00911077.
XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S. 

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 

DR P-PSDB; ADD50641. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Claim 30; SEQ ID NO 3; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide
CC sequences encoding human and mouse high-affinity choline transporter
CC (hCHT and mCHT respectively), and the proteins they encode. The gene
CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence
CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT
CC polynucleotide sequence when delivered to a cell, increases cholinergic
CC function in the cell that is in a patient having Parkinson's disease,
CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or
CC myasthenia gravis. The hCHT antibody is useful for controlling
CC transporter CHT proteins to the brain, and for treating the above
CC mentioned diseases. The antibody is also useful for diagnosing the above
CC mentioned disorders and to detect the influence of cholinergic
CC signalling. The present sequence encodes mCHT. Note: The sequence data
CC for this patent was obtained in electronic format directly from the USPTO
CC web site at seqdata.uspto.gov.

XX 

SQ Sequence 1743 BP; 406 A; 409 C; 410 G; 518 T; 0 U; 0 Other; 

Query Match 78.9%; Score 1375; DB 9; Length 1743; 

Best Local Similarity 86.8%; Pred. No. 0; 

Matches 1513; Conservative 0; Mismatches 230; Indels 0; Gaps 0;



Qy 1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

III 1 1 I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 MINIM II II II III

Db 1 ATGCCTTTCCATGTGGAAGGACTGGTAGCTATTATCCTCTTCTACCTCCTTATATTTCTG 60

Qy 61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

1 1 1 1 1 1 1 1 1 1 II 1 II 1 II 1 1 1 1 1 1 1 1 1 MM 1 III M M 1 III

Db 61 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGCAACCCAGAAGAGCGCAGTGAA 120

Qy 121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

I I I I I I 1 M II II Mill II 1 II M 1 1 1 II 1 1 1 1 1 M 1 1 1 1 M 1 II 1 1 II II 1

Db 121 GCCATCATAGTCGGGGGCCGTGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 180

Qy 181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

M II M I I I M I M II II II I I I M I M II I I I I II M I I II I M II II I I

Db 181 ACCTGGGTTGGAGGAGGCTACATCAATGGGACAGCAGAAGCAGTGTATGGGCCAGGTTGT 240

Qy 241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300 

II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I Ml 

Db 241 GGTCTAGCTTGGGCTCATGCACCCATTGGATATTCTCTGAGTCTAATTTTAGGTGGTCTG 300 

Qy 301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II 

Db 301 TTTTTTGCGAAACCTATGCGTTCCAAGGGATATGTGACTATGTTAGACCCATTCAAACAG 360 

Qy 361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

I I II I I I I I I I I I I I I I I I II II II II II I I I I I I I I I I I I I I I I I I I I I I I 

Db 361 ATCTATGGAAAGCGCATGGGTGGGCTGCTCTTCATCCCTGCACTGATGGGAGAGATGTTC 420 

Qy 421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

I I I II I I I I I I I I I I I I I I I I I I II II II I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCCACCATCAGCGTGATCATTGATGTGGAT 480

Qy 481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 GTGAACATATCGGTCATTGTCTCTGCACTCATTGCCATTCTTTATACCCTAGTGGGTGGG 540

Qy 541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I M I 

Db 541 CTCTACTCTGTGGCATATACTGATGTTGTCCAGCTATTCTGCATTTTTATAGGACTGTGG 600 

Qy 601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I MINI 
Db 601 ATCAGTGTCCCTTTTGCCCTGTCACATCCTGCAGTCACCGACATCGGATTCACAGCTGTG 660 

Qy 661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I 
Db 661 CATGCTAAATACCAGAGTCCCTGGCTGGGAACCATTGAATCAGTTGAAGTCTACACCTGG 720

Qy 721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I 

Db 721 CTTGATAATTTTCTGTTATTGATGCTGGGTGGAATCCCATGGCAAGCCTACTTCCAGAGG 780 

Qy 781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

II I I I I II I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I Ml 

Db 781 GTCCTCTCTTCATCCTCAGCCACCTATGCTCAGGTACTGTCCTTCCTGGCAGCTTTTGGG 840

Qy 841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900

I I II I I II I I I I M I I I I I I I I I I I I I I II I I II I I I I II I I I II I II I 
Db 841 TGCCTGGTGATGGCTCTACCCGCCATATGCATAGGAGCTATTGGAGCTTCCACAGACTGG 900

Qy 901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

I M I I I I I I I I II Ml I M II M I I II I I II II II II II I I I I I I I II I I 

Db 901 AACCAGACTGCCTACGGGTATCCAGATCCCAAGACTAAGGAGGAAGCAGACATGATTCTC 960

Qy 961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020 

II II I II II I I II M II I M I M II I I I I II II I I I I I I II I I I I I I II III 

Db 961 CCGATCGTTCTGCAGTACCTCTGCCCTGTGTACATCTCCTTCTTTGGGCTTGGTGCTGTT 1020 

Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080

II I I II M I I I I I I I Mill M II I I M I I MM II Mill I I I II I I I 

Db 1021 TCAGCTGCTGTCATGTCCTCAGCTGACTCGTCCATCCTGTCGGCGAGTTCTATGTTTGCT 1080 



Qy 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140 



I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I II I I I I I II Mill

Db 1081 CGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAAATTGTGTGGGTC 1140



Qy 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

III I I I II I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1141 ATGAGGATCACTGTGCTTGTGTTCGGAGCATCTGCAACAGCCATGGCTTTGCTGACGAAG 1200

Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I III

Db 1201 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1260

Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II II I I I I I I II

Db 1261 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1320

Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380

I III II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I

Db 1321 TTTGGACTATTCCTGAGAATTACTGGAGGAGAGCCATATCTATACTTGCAGCCCTTAATC 1380

Qy 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I

Db 1381 TTCTACCCTGGTTATTACTCTGACAAGAATGGTATATACAATCAGAGGTTCCCATTTAAA 1440

Qy 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I

Db 1441 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCTTATCTAGCCAAGTAT 1500

Qy 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560

I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I II I I I I I I Mill.

Db 1501 CTATTTGAAAGTGGAACCTTGCCTCCAAAATTAGATGTATTTGATGCTGTTGTCGCAAGG 1560

Qy 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

I I I II I I I I II I II I I I I I I I I II II I II II II I I II I I M I II I I I I I I I II

Db 1561 CACAGTGAAGAGAACATGGACAAGACCATTCTAGTCAGAAATGAAAATATCAAATTAAAT 1620

Qy 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

I M II I I I I I I I II I I II II II I M I I I I I I II I I I I I I I II I I I I I II I I I

Db 1621 GAACTTGCACCTGTGAAACCTCGGCAGAGCCTAACCCTCAGTTCAACTTTCACCAATAAG 1680

Qy 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

I II II I I I II I I II M II I II I II II II II I I I I II I I I I I I II II I I II I I II I I

Db 1681 GAGGCCCTCCTTGATGTTGATTCCAGTCCGGAGGGGTCTGGGACTGAAGATAATTTACAA 1740

Qy 1741 TGA 1743

I I I

Db 1741 TGA 1743



RESULT 9 
ADD50660 

ID ADD50660 standard; cDNA; 1743 BP. 
XX 

AC ADD50660; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE cDNA encoding mouse high-affinity choline transporter (mCHT) #2. 
XX



KW Mouse; high-affinity choline transporter; mCHT; cholinergic function; 

KW Parkinson's disease; Huntington's disease; Alzheimer's disease; 

KW schizophrenia; dysautonomia; myasthenia gravis; brain; 

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 

KW neuroprotective; neuroleptic; gene; ss. 

XX 

OS Mus sp.
XX 

FH Key Location/Qualifiers 

FT CDS 1..1743

FT /*tag= a

FT /product= "mCHT #2"

XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077.
XX 

PR 23-JUL-2001; 2001US-00911077.
XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S.

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 

DR P-PSDB; ADD50661. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Example 4; SEQ ID NO 23; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide
CC sequences encoding human and mouse high-affinity choline transporter
CC (hCHT and mCHT respectively), and the proteins they encode. The gene
CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence
CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT
CC polynucleotide sequence when delivered to a cell, increases cholinergic
CC function in the cell that is in a patient having Parkinson's disease,
CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or
CC myasthenia gravis. The hCHT antibody is useful for controlling
CC transporter CHT proteins to the brain, and for treating the above
CC mentioned diseases. The antibody is also useful for diagnosing the above
CC mentioned disorders and to detect the influence of cholinergic
CC signalling. The present sequence encodes mCHT. Note: The sequence data
CC for this patent was obtained in electronic format directly from the USPTO
CC web site at seqdata.uspto.gov.

XX 

SQ Sequence 1743 BP; 406 A; 409 C; 410 G; 518 T; 0 U; 0 Other; 



Query Match 78.9%; Score 1375; DB 9; Length 1743;



Best Local Similarity 86.8%; Pred. No. 0;

Matches 1513; Conservative 0; Mismatches 230; Indels 0; Gaps 0; 



Qy 1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60 

III I I I I I I I I I I I I I I I II I II I I I I I I I Ml I I I I I I I I I II II Mill 
Db 1 ATGCCTTTCCATGTGGAAGGACTGGTAGCTATTATCCTCTTCTACCTCCTTATATTTCTG 60 

Qy 61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 61 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGCAACCCAGAAGAGCGCAGTGAA 120

Qy 121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

II I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 

Db 121 GCCATCATAGTCGGGGGCCGTGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 180

Qy 181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ! I I I I I I I I I I I 

Db 181 ACCTGGGTTGGAGGAGGCTACATCAATGGGACAGCAGAAGCAGTGTATGGGCCAGGTTGT 240

Qy 241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 241 GGTCTAGCTTGGGCTCATGCACCCATTGGATATTCTCTGAGTCTAATTTTAGGTGGTCTG 300 

Qy 301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I M 

Db 301 TTTTTTGCGAAACCTATGCGTTCCAAGGGATATGTGACTATGTTAGACCCATTCAAACAG 360

Qy 361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

II I I I I I I I I I I II I II I I II II II II II I I I I I I I I I I I I I I I I I I I I I I I 

Db 361 ATCTATGGAAAGCGCATGGGTGGGCTGCTCTTCATCCCTGCACTGATGGGAGAGATGTTC 420 

Qy 421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCCACCATCAGCGTGATCATTGATGTGGAT 480

Qy 481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 481 GTGAACATATCGGTCATTGTCTCTGCACTCATTGCCATTCTTTATACCCTAGTGGGTGGG 540

Qy 541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600 

I I I I I I I I I I I I I II I I I I I I I I II I I I I I II I I I I I I I I I II I I I I I I I I 

Db 541 CTCTACTCTGTGGCATATACTGATGTTGTCCAGCTATTCTGCATTTTTATAGGACTGTGG 600 

Qy 601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 
Db 601 ATCAGTGTCCCTTTTGCCCTGTCACATCCTGCAGTCACCGACATCGGATTCACAGCTGTG 660

Qy 661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720 

I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I III I I I I I I I I I I I Ml 

Db 661 CATGCTAAATACCAGAGTCCCTGGCTGGGAACCATTGAATCAGTTGAAGTCTACACCTGG 720

Qy 721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780

I II II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 721 CTTGATAATTTTCTGTTATTGATGCTGGGTGGAATCCCATGGCAAGCCTACTTCCAGAGG 780

Qy 781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840 

II I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I Ml 



Db 781 GTCCTCTCTTCATCCTCAGCCACCTATGCTCAGGTACTGTCCTTCCTGGCAGCTTTTGGG 840

Qy 841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900

I I M I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I 

Db 841 TGCCTGGTGATGGCTCTACCCGCCATATGCATAGGAGCTATTGGAGCTTCCACAGACTGG 900

Qy 901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

I I I I I I I I I I I II III I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I 

Db 901 AACCAGACTGCCTACGGGTATCCAGATCCCAAGACTAAGGAGGAAGCAGACATGATTCTC 960

Qy 961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020

II || I I I I I I I I I I I I I I I I I Ml M M I I I I I I II I I I I I I I I III 

Db 961 CCGATCGTTCTGCAGTACCTCTGCCCTGTGTACATCTCCTTCTTTGGGCTTGGTGCTGTT 1020

Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080

II I I I I I I I I I I I I I I I I I I M M I I I I! I I I I I I MINIM 

Db 1021 TCAGCTGCTGTCATGTCCTCAGCTGACTCGTCCATCCTGTCGGCGAGTTCTATGTTTGCT 1080

Qy 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140

I I I I I I M II II I I I I II II M II II M II M II II II II I I I M I M Mill 
Db 1081 CGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAAATTGTGTGGGTC 1140

Qy 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

Ml | | | | | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I 1 I I I I I I I 
Db 1141 ATGAGGATCACTGTGCTTGTGTTCGGAGCATCTGCAACAGCCATGGCTTTGCTGACGAAG 1200

1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260 

I I M I I I I II I II I I II I I I I I I M II I I I I II II I I I I I II I II II II II III 

12 01 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1260 

12 61 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320 

I I I M II II I II II I I I M I I M II II II I II I M M I II II I I I I I I M 

1261 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1320 

1321 TCTGGCCTCTTCCT GAGAAT AACT GGAGG GGAGC C AT AT CT GT AT CT T CAGC C CT T GAT C 1380 

I Ml II I II II I I I I I I II II II I I I II M I I M II II I II I M I II I M 
1321 TTT GGACTATT CCT GAGAATTACT GGAGGAGAGCCAT ATCT ATACTT GCAGC CCTTAAT C 1380 

1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440 

I I I I II I II I I M I II II II I I II I I II I II II II II I II 

1381 T T C T AC C CT GGTT AT TACT CT GAC AAGAAT GGT AT AT ACAAT CAGAGGTT C C CAT TT AAA 1440 

1441 AC ACT T GC C AT G GT T AC AT CAT T CT T AAC CAAC AT T T GC AT CT C CT AT CT AG C C AAGT AT 1500 

M II I I II II M I I II I II II I II II II I II II I II I M II I I II II II II 
1441 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCTTATCTAGCCAAGTAT 1500 

1501 CT AT T T GAAAGT GGAAC CT T GC C AC CT AAAT T AGAT GT AT T T GAT G CT GT T GT T GC AAGA 1560 

MM MINI M I II II II II I II II II M II I I M M 

1501 CT AT TT GAAAGT GGAAC CT T GCCT C CAAAAT T AGAT GT AT T TGAT GCT GT T GT C GCAAGG 1560 

1561 CAC AGT GAAG AAAAC AT G G AT AAG AC AAT T C T T GT C AAAAAT G AAAAT AT T AAAT T AGAT 1620 

I || I I || I I II I M II I II I II I I II I M I II I II II II II II I I II II II II 
1561 CACAGT GAAGAGAAC AT GGACAAGAC CAT T CT AGT C AGAAAT GAAAAT AT CAAAT T AAAT 1620 

1621 G AACT T GCACT T GT GAAGC CAC GACAGAGC AT GAC C CT C AG CT C AACT T T CAC C AAT AAA 1680 

I I II II I I M MUM II II II II II I I M II II I I II II II II II II I M I 
1621 GAACTT GC AC CT GT GAAAC CTC G GC AGAG C CT AAC C CT C AGTT CAACT T T CAC CAAT AAG 168 0 



Qy   1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
          |||||| |||||||||||||||||||||| || ||||||||||||||||||||||||||
Db   1681 GAGGCCCTCCTTGATGTTGATTCCAGTCCGGAGGGGTCTGGGACTGAAGATAATTTACAA 1740

Qy   1741 TGA 1743
          |||
Db   1741 TGA 1743



RESULT 10 
AAF81713 

ID AAF81713 standard; cDNA; 1743 BP. 
XX 

AC AAF81713; 
XX 

DT 01-JUN-2001 (first entry) 
XX 

DE Mouse high affinity choline transporter protein encoding cDNA. 
XX 

KW High affinity choline transporter; cho-1; Alzheimer's disease; diagnosis; 

KW ss.
XX

OS Mus musculus.
XX

FH Key Location/Qualifiers
FT CDS 1..1743

FT /*tag= a 

FT /product= "high affinity choline transporter" 

XX 

PN WO200116315-A1. 
XX 

PD 08-MAR-2001. 
XX 

PF 18-AUG-2000; 2000WO-JP005545.
XX

PR 27-AUG-1999; 99JP-00240642.
PR 27-DEC-1999; 99JP-00368991.
XX 

PA (NISC-) JAPAN SCI & TECHNOLOGY CORP. 
XX 

PI Haga T, Okuda T; 
XX 

DR WPI; 2001-226688/23. 
DR P-PSDB; AAB74666. 
XX 

PT New rat and human spinal cord high affinity choline transporters, useful 
PT in diagnosis of Alzheimer's disease and screening promoters as drugs for 
PT treating Alzheimer's disease. 
XX 

PS Claim 12; Page 78-82; 90pp; Japanese. 
XX 

CC The present sequence encodes a mouse (Mus musculus) high affinity choline 
CC transporter protein designated cho-1. The cho-1 protein has nootropic and 
CC neuroprotective activities. The cho-1 polynucleotide and protein can be 
CC used for the diagnosis of diseases related to the expression of cho-1 by 
CC comparing the cho-1 polynucleotide sequence in a sample to that of a 



CC control. Drug compositions containing the cho-1 protein or expression 

CC promoters or inhibitors of cho-1 are useful for treating disorders 

CC characterised by abnormal levels of cho-1, such as Alzheimer's disease
XX 

SQ Sequence 1743 BP; 407 A; 410 C; 409 G; 517 T; 0 U; 0 Other; 

Query Match 78.8%; Score 1373.4; DB 4; Length 1743; 

Best Local Similarity 86.7%; Pred. No. 0; 

Matches 1512; Conservative 0; Mismatches 231; Indels 0; Gaps 0; 

Qy      1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
          ||| ||||||| || ||||||||| ||||||| ||| | |||||||| || ||  | |||
Db      1 ATGTCTTTCCACGTAGAAGGACTGGTAGCTATTATCCTCTTCTACCTCCTTATACTTCTG 60

Qy     61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
          ||||||||||||||||| |||| |||||||||||| |||| | ||||||||||||| |||
Db     61 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGCAACCCAGAAGAGCGCAGTGAA 120

Qy    121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
          ||||||||||| || ||||| || |||||||| ||||||||||| ||||||||||||||
Db    121 GCCATCATAGTCGGGGGCCGTGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 180

Qy    181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
          |||||||| |||||||| || |||||||| ||||| |||||||| ||||  ||||||| |
Db    181 ACCTGGGTTGGAGGAGGCTACATCAATGGGACAGCAGAAGCAGTGTATGGGCCAGGTTGT 240

Qy    241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
          || |||||||||||||||||||| |||||||||||||| ||||| ||||||||||| |||
Db    241 GGTCTAGCTTGGGCTCAGGCACCCATTGGATATTCTCTGAGTCTAATTTTAGGTGGTCTG 300

Qy    301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
          || ||||| |||||||||||||| ||||| |||||||| ||||||||||| ||||| ||
Db    301 TTTTTTGCGAAACCTATGCGTTCCAAGGGATATGTGACTATGTTAGACCCATTTCAACAG 360

Qy    361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
          ||||||||||| |||||||| || || || || || ||||||||||||||||| ||||||
Db    361 ATCTATGGAAAGCGCATGGGTGGGCTGCTCTTCATCCCTGCACTGATGGGAGAGATGTTC 420

Qy    421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
          ||||||||||||||||||||||| || || |||||||||||||||||||| |||||||||
Db    421 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCCACCATCAGCGTGATCATTGATGTGGAT 480

Qy    481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
           || |||| || |||||  |||||||||||||||||| ||| || || || ||||| |||
Db    481 GTGAACATATCGGTCATTGTCTCTGCACTCATTGCCATTCTTTATACCCTAGTGGGTGGG 540

Qy    541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
          ||||| |||||||| || |||||||| || ||||| || ||||||||| |||| ||||||
Db    541 CTCTACTCTGTGGCATATACTGATGTTGTCCAGCTATTCTGCATTTTTATAGGACTGTGG 600

Qy    601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
          ||||| ||||| |||||  ||||||||||||||||| | |||||||| ||||| ||||||
Db    601 ATCAGTGTCCCTTTTGCCCTGTCACATCCTGCAGTCACCGACATCGGATTCACAGCTGTG 660



Qy    661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
          ||||| |||||||| |  || |||||||||||  |||| |||  |||||||||| | |||
Db    661 CATGCTAAATACCAGAGTCCCTGGCTGGGAACCATTGAATCAGTTGAAGTCTACACCTGG 720



Qy    721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
          ||||||| ||||||||| ||||||||||||||||||||||||||||| ||||| ||||||
Db    721 CTTGATAATTTTCTGTTATTGATGCTGGGTGGAATCCCATGGCAAGCCTACTTCCAGAGG 780

Qy    781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
          || |||||||| |||||||||||||||||||| || |||||||||||||||||||| |||
Db    781 GTCCTCTCTTCATCCTCAGCCACCTATGCTCAGGTACTGTCCTTCCTGGCAGCTTTTGGG 840

Qy    841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
          ||||||||||||||  | || ||||||  ||| || || |||||||| || |||||||||
Db    841 TGCCTGGTGATGGCTCTACCCGCCATATGCATAGGAGCTATTGGAGCTTCCACAGACTGG 900

Qy    901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
          ||||||||||| || |||  |||||||||||||||||  || || |||||||||||| |
Db    901 AACCAGACTGCCTACGGGTATCCAGATCCCAAGACTAAGGAGGAAGCAGACATGATTCTC 960

Qy    961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
          || || ||||||||||| |||||||||||||| || || |||||||| |||||||| |||
Db    961 CCGATCGTTCTGCAGTACCTCTGCCCTGTGTACATCTCCTTCTTTGGGCTTGGTGCTGTT 1020

Qy   1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
          || |||||||| ||||| ||||| || || |||||| |||| || ||||| ||||||||
Db   1021 TCAGCTGCTGTCATGTCCTCAGCTGACTCGTCCATCCTGTCGGCGAGTTCTATGTTTGCT 1080

Qy   1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
          ||||| ||||||||||||||||||||||||||||| || ||||| ||||| || |||||
Db   1081 CGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAAATTGTGTGGGTC 1140

Qy   1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
          ||| | ||||| ||| ||||||| ||||||||||||||||||||||| |||||||||||
Db   1141 ATGAGGATCACTGTGCTTGTGTTCGGAGCATCTGCAACAGCCATGGCTTTGCTGACGAAG 1200

Qy   1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
          ||||||||||||||||||||||| || ||||||||||| |||||| | |||||||| |||
Db   1201 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1260

Qy   1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
          ||||| |||||||||||  | || ||||||||||| |||||||| || || |||||| ||
Db   1261 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1320

Qy   1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
          | ||| || ||||||||||| |||||||| ||||||||||| ||  | |||||||| |||
Db   1321 TTTGGACTATTCCTGAGAATTACTGGAGGAGAGCCATATCTATACTTGCAGCCCTTAATC 1380

Qy   1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
          ||||||||||| |||||| ||||  | ||||||||||| |||||||  || |||||||||
Db   1381 TTCTACCCTGGTTATTACTCTGACAAGAATGGTATATACAATCAGAGGTTCCCATTTAAA 1440

Qy   1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
          || ||  |||||||||| |||||||| |||||||||||  | || |||||||||||||||
Db   1441 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCTTATCTAGCCAAGTAT 1500

Qy   1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
          ||||||||||||||||||||||| || |||||||||||||||||||||||||| |||||
Db   1501 CTATTTGAAAGTGGAACCTTGCCTCCAAAATTAGATGTATTTGATGCTGTTGTCGCAAGG 1560



Qy   1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
          ||||||||||| |||||||| ||||| ||||| |||| |||||||||||| |||||| ||
Db   1561 CACAGTGAAGAGAACATGGACAAGACCATTCTAGTCAGAAATGAAAATATCAAATTAAAT 1620

Qy   1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
          |||||||||| |||||| || || |||||| | |||||||| |||||||||||||||||
Db   1621 GAACTTGCACCTGTGAAACCTCGGCAGAGCCTAACCCTCAGTTCAACTTTCACCAATAAG 1680

Qy   1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
          |||||| |||||||||||||||||||||| || |||||||||||||||||||| |||||
Db   1681 GAGGCCCTCCTTGATGTTGATTCCAGTCCGGAGGGGTCTGGGACTGAAGATAACTTACAA 1740

Qy   1741 TGA 1743
          |||
Db   1741 TGA 1743



RESULT 11
AAD02457

ID AAD02457 standard; cDNA; 4938 BP.
XX

AC AAD02457;
XX

DT 24-APR-2001 (first entry)
XX

DE Mouse P4P6B1 OMA (obese mice adipocyte) protein encoding cDNA.
XX

KW Mouse; OMA protein; obese mice adipocyte; P4P6B1;
KW fuel metabolism disorder; therapy; obesity; diabetes; gene therapy;
KW anorectic; antidiabetic; ss.
XX

OS Mus sp.
XX

FH Key Location/Qualifiers
FT CDS 247..1989
FT /*tag= a
FT /product= "Mouse P4P6B1 OMA protein"
FT misc_feature 988..1342
FT /*tag= b
FT /note= "Portion of original 450 bp PCR fragment"
XX

PN WO200078950-A2.
XX

PD 28-DEC-2000.
XX

PF 13-JUN-2000; 2000WO-US016217.
XX

PR 22-JUN-1999; 99US-0141515P.
XX

PA (AMYL-) AMYLIN PHARM INC.
XX

PI Sierzega M, Albrandt K;
XX

DR WPI; 2001-112322/12.
DR P-PSDB; AAY72388.
XX

PT Novel obese mice adipocyte polypeptides useful in diagnosis and treatment
PT of disorders of fuel metabolism such as obesity or diabetes.
XX 

PS Claim 2; Fig 3; 83pp; English. 
XX 

CC The present sequence is mouse P4P6B1 cDNA which encodes OMA (obese mice 

CC adipocyte) protein. The P4P6B1 fragment was generated by RNA 

CC fingerprinting using random primers P4 and P6. OMA is used as a 

CC diagnostic reagent for diagnosing a disorder of fuel metabolism in an 

CC underweight or an overweight individual, by detecting the transcription

CC level of a gene encoding OMA, which is induced or repressed in an 

CC individual by a factor such as genetic obesity, fasting and refeeding of 

CC a fasted individual. OMA is useful in the generation of antibodies, for 

CC use in pharmaceutical compositions and for studying DNA/protein 

CC interactions. Nucleic acids encoding OMA are involved in gene therapy. An 

CC inhibitor of OMA or an antisense oligonucleotide that inhibits expression 

CC of OMA are useful for treating disorders of fuel metabolism such as 

CC obesity or diabetes 

XX 

SQ Sequence 4938 BP; 1436 A; 1012 C; 976 G; 1514 T; 0 U; 0 Other; 



Query Match 78.8%; Score 1373.4; DB 5; Length 4938;

Best Local Similarity 86.7%; Pred. No. 0; 

Matches 1512; Conservative 0; Mismatches 231; Indels 0; Gaps 0; 



Qy      1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
          ||| ||||||| || ||||||||| ||||||| ||| | |||||||| || ||  | |||
Db    247 ATGTCTTTCCACGTAGAAGGACTGGTAGCTATTATCCTCTTCTACCTCCTTATACTTCTG 306

Qy     61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
          ||||||||||||||||| |||| |||||||||||| |||| | ||||||||||||| |||
Db    307 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGCAACCCAGAAGAGCGCAGTGAA 366

Qy    121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
          ||||||||||| || ||||| || |||||||| ||||||||||| ||||||||||||||
Db    367 GCCATCATAGTCGGGGGCCGTGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 426

Qy    181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
          |||||||| |||||||| || |||||||| ||||| |||||||| ||||  ||||||| |
Db    427 ACCTGGGTTGGAGGAGGCTACATCAATGGGACAGCAGAAGCAGTGTATGGGCCAGGTTGT 486

Qy    241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
          || |||||||||||||||||||| |||||||||||||| ||||| ||||||||||| |||
Db    487 GGTCTAGCTTGGGCTCAGGCACCCATTGGATATTCTCTGAGTCTAATTTTAGGTGGTCTG 546

Qy    301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
          || ||||| |||||||||||||| ||||| |||||||| ||||||||||| ||||| ||
Db    547 TTTTTTGCGAAACCTATGCGTTCCAAGGGATATGTGACTATGTTAGACCCATTTCAACAG 606

Qy    361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
          ||||||||||| |||||||| || || || || || ||||||||||||||||| ||||||
Db    607 ATCTATGGAAAGCGCATGGGTGGGCTGCTCTTCATCCCTGCACTGATGGGAGAGATGTTC 666

Qy    421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
          ||||||||||||||||||||||| || || |||||||||||||||||||| |||||||||
Db    667 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCCACCATCAGCGTGATCATTGATGTGGAT 726

Qy    481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
           || |||| || |||||  |||||||||||||||||| ||| || || || ||||| |||
Db    727 GTGAACATATCGGTCATTGTCTCTGCACTCATTGCCATTCTTTATACCCTAGTGGGTGGG 786

Qy    541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
          ||||| |||||||| || |||||||| || ||||| || ||||||||| |||| ||||||
Db    787 CTCTACTCTGTGGCATATACTGATGTTGTCCAGCTATTCTGCATTTTTATAGGACTGTGG 846

Qy    601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
          ||||| ||||| |||||  ||||||||||||||||| | |||||||| ||||| ||||||
Db    847 ATCAGTGTCCCTTTTGCCCTGTCACATCCTGCAGTCACCGACATCGGATTCACAGCTGTG 906

Qy    661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
          ||||| |||||||| |  || |||||||||||  |||| |||  |||||||||| | |||
Db    907 CATGCTAAATACCAGAGTCCCTGGCTGGGAACCATTGAATCAGTTGAAGTCTACACCTGG 966

Qy    721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
          ||||||| ||||||||| ||||||||||||||||||||||||||||| ||||| ||||||
Db    967 CTTGATAATTTTCTGTTATTGATGCTGGGTGGAATCCCATGGCAAGCCTACTTCCAGAGG 1026

Qy    781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
          || |||||||| |||||||||||||||||||| || |||||||||||||||||||| |||
Db   1027 GTCCTCTCTTCATCCTCAGCCACCTATGCTCAGGTACTGTCCTTCCTGGCAGCTTTTGGG 1086

Qy    841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
          ||||||||||||||  | || ||||||  ||| || || |||||||| || |||||||||
Db   1087 TGCCTGGTGATGGCTCTACCCGCCATATGCATAGGAGCTATTGGAGCTTCCACAGACTGG 1146

Qy    901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
          ||||||||||| || |||  |||||||||||||||||  || || |||||||||||| |
Db   1147 AACCAGACTGCCTACGGGTATCCAGATCCCAAGACTAAGGAGGAAGCAGACATGATTCTC 1206

Qy    961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
          || || ||||||||||| |||||||||||||| || || |||||||| |||||||| |||
Db   1207 CCGATCGTTCTGCAGTACCTCTGCCCTGTGTACATCTCCTTCTTTGGGCTTGGTGCTGTT 1266

Qy   1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
          || |||||||| ||||| ||||| || || |||||| |||| || ||||| ||||||||
Db   1267 TCAGCTGCTGTCATGTCCTCAGCTGACTCGTCCATCCTGTCGGCGAGTTCTATGTTTGCT 1326

Qy   1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
          ||||| ||||||||||||||||||||||||||||| || ||||| ||||| || |||||
Db   1327 CGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAAATTGTGTGGGTC 1386

Qy   1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
          ||| | ||||| ||| ||||||| ||||||||||||||||||||||| |||||||||||
Db   1387 ATGAGGATCACTGTGCTTGTGTTCGGAGCATCTGCAACAGCCATGGCTTTGCTGACGAAG 1446

Qy   1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
          ||||||||||||||||||||||| || ||||||||||| |||||| | |||||||| |||
Db   1447 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1506

Qy   1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
          ||||| |||||||||||  | || ||||||||||| |||||||| || || |||||| ||
Db   1507 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1566

Qy   1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
          | ||| || ||||||||||| |||||||| ||||||||||| ||  | |||||||| |||
Db   1567 TTTGGACTATTCCTGAGAATTACTGGAGGAGAGCCATATCTATACTTGCAGCCCTTAATC 1626

Qy   1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
          ||||||||||| |||||| ||||  | ||||||||||| |||||||  || |||||||||
Db   1627 TTCTACCCTGGTTATTACTCTGACAAGAATGGTATATACAATCAGAGGTTCCCATTTAAA 1686

Qy   1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
          || ||  |||||||||| |||||||| |||||||||||  | || |||||||||||||||
Db   1687 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCTTATCTAGCCAAGTAT 1746

Qy   1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
          ||||||||||||||||||||||| || |||||||||||||||||||||||||| |||||
Db   1747 CTATTTGAAAGTGGAACCTTGCCTCCAAAATTAGATGTATTTGATGCTGTTGTCGCAAGG 1806

Qy   1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
          ||||||||||| |||||||| ||||| ||||| |||| |||||||||||| |||||| ||
Db   1807 CACAGTGAAGAGAACATGGACAAGACCATTCTAGTCAGAAATGAAAATATCAAATTAAAT 1866

Qy   1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
          |||||||||| |||||| || || |||||| | |||||||| |||||||||||||||||
Db   1867 GAACTTGCACCTGTGAAACCTCGGCAGAGCCTAACCCTCAGTTCAACTTTCACCAATAAG 1926

Qy   1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
          |||||| |||||||||||||||||||||| || |||||||||||||||||||| |||||
Db   1927 GAGGCCCTCCTTGATGTTGATTCCAGTCCGGAGGGGTCTGGGACTGAAGATAACTTACAA 1986

Qy   1741 TGA 1743
          |||
Db   1987 TGA 1989



RESULT 12 
AAH49206 



ID AAH49206 standard; DNA; 8760 BP. 
XX 

AC AAH49206; 
XX 

DT 26-NOV-2001 (first entry) 
XX 

DE Human CHOT exons 6, 7, 8 and 3' UTR region DNA. 
XX 

KW CHOT; human; choline transporter; chromosome 2q11-13; nootropic;
KW neuroprotective; gene therapy; antisense therapy; degenerative disease;
KW cognitive disorder; Alzheimer's disease; ds.
XX

OS Homo sapiens.
XX

FH Key Location/Qualifiers
FT CDS 1..4853
FT /*tag= a
FT /product= "CHOT"
FT /note= "This sequence is interrupted by introns"
FT exon 41..194
FT /*tag= b
FT /number= 6
FT intron 195..2456
FT /*tag= c
FT /number= 6
FT exon 2455..2674
FT /*tag= d
FT /number= 7
FT intron 2675..4223
FT /*tag= e
FT /number= 7
FT exon 4224..4853
FT /*tag= f
FT /number= 8
FT 3'UTR 4854..8760
FT /*tag= g
XX

PN DE10009055-A1.
XX

PD 30-AUG-2001.
XX

PF 28-FEB-2000; 2000DE-01009055.
XX

PR 28-FEB-2000; 2000DE-01009055.
XX

PA (BRUE/) BRUESS M.
PA (BOEN/) BOENISCH H.
XX

PI Bruess M, Boenisch H;
XX

DR WPI; 2001-590709/67.
XX

PT A new gene encoding human choline transporter, designated hCHOT is
PT located on chromosome 2q11-13 and is useful to treat degenerative
PT disorders such as Alzheimer's disease.
XX

PS Disclosure; Page 9-11; 12pp; German.
XX

CC This invention describes a novel gene encoding human choline transporter,
CC designated hCHOT which is located on chromosome 2q11-13. The products of
CC the invention have nootropic and neuroprotective activity and can be used
CC for gene or antisense therapy. (I) is used to treat degenerative disease,
CC particularly cognitive disorders such as Alzheimer's disease. Sense and
CC antisense oligonucleotides derived from the gene may be used in
CC diagnostics and other techniques. This sequence represents exons 6-8 and
CC the 3' UTR fragment encoding the human CHOT protein described in the
CC invention
XX

SQ Sequence 8760 BP; 2727 A; 1619 C; 1565 G; 2849 T; 0 U; 0 Other;

Query Match 36.2%; Score 630.8; DB 5; Length 8760;
Best Local Similarity 99.7%; Pred. No. 1.8e-172;
Matches 632; Conservative 0; Mismatches 2; Indels 0; Gaps 0;



Qy   1110 AAATGCTTCGGACAAAGAAATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGC 1169
          | | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4220 ACAGGCTTCGGACAAAGAAATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGC 4279

Qy   1170 ATCTGCAACAGCCATGGCCTTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTC 1229
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4280 ATCTGCAACAGCCATGGCCTTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTC 4339



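Qy   1230 TGACCTTGTTTACATCGTTATCTTCCCCCAGCTGCTTTGTGTACTCTTTGTTAAGGGAAC 1289
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4340 TGACCTTGTTTACATCGTTATCTTCCCCCAGCTGCTTTGTGTACTCTTTGTTAAGGGAAC 4399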



Qy   1290 CAACACCTATGGGGCCGTGGCAGGTTATGTTTCTGGCCTCTTCCTGAGAATAACTGGAGG 1349
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4400 CAACACCTATGGGGCCGTGGCAGGTTATGTTTCTGGCCTCTTCCTGAGAATAACTGGAGG 4459

Qy   1350 GGAGCCATATCTGTATCTTCAGCCCTTGATCTTCTACCCTGGCTATTACCCTGATGATAA 1409
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4460 GGAGCCATATCTGTATCTTCAGCCCTTGATCTTCTACCCTGGCTATTACCCTGATGATAA 4519

Qy   1410 TGGTATATATAATCAGAAATTTCCATTTAAAACACTTGCCATGGTTACATCATTCTTAAC 1469
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4520 TGGTATATATAATCAGAAATTTCCATTTAAAACACTTGCCATGGTTACATCATTCTTAAC 4579

Qy   1470 CAACATTTGCATCTCCTATCTAGCCAAGTATCTATTTGAAAGTGGAACCTTGCCACCTAA 1529
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4580 CAACATTTGCATCTCCTATCTAGCCAAGTATCTATTTGAAAGTGGAACCTTGCCACCTAA 4639

Qy   1530 ATTAGATGTATTTGATGCTGTTGTTGCAAGACACAGTGAAGAAAACATGGATAAGACAAT 1589
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4640 ATTAGATGTATTTGATGCTGTTGTTGCAAGACACAGTGAAGAAAACATGGATAAGACAAT 4699

Qy   1590 TCTTGTCAAAAATGAAAATATTAAATTAGATGAACTTGCACTTGTGAAGCCACGACAGAG 1649
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4700 TCTTGTCAAAAATGAAAATATTAAATTAGATGAACTTGCACTTGTGAAGCCACGACAGAG 4759

Qy   1650 CATGACCCTCAGCTCAACTTTCACCAATAAAGAGGCCTTCCTTGATGTTGATTCCAGTCC 1709
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4760 CATGACCCTCAGCTCAACTTTCACCAATAAAGAGGCCTTCCTTGATGTTGATTCCAGTCC 4819

Qy   1710 AGAAGGGTCTGGGACTGAAGATAATTTACAGTGA 1743
          ||||||||||||||||||||||||||||||||||
Db   4820 AGAAGGGTCTGGGACTGAAGATAATTTACAGTGA 4853



RESULT 13 
ADD50656 

ID ADD50656 standard; DNA; 119040 BP. 
XX 

AC ADD50656; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE BAC sequence containing hCHT gene. 
XX 

KW Human; high-affinity choline transporter; hCHT; chromosome 2q12;

KW cholinergic function; Parkinson's disease; Huntington's disease;

KW Alzheimer's disease; schizophrenia; dysautonomia; myasthenia gravis;

KW brain; cholinergic signalling; antiparkinsonian; anticonvulsant; 

KW nootropic; neuroprotective; neuroleptic; bacterial artificial chromosome; 

KW BAC; ds. 

XX 

OS Homo sapiens. 
XX 



PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077.
XX

PR 23-JUL-2001; 2001US-00911077.
XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S. 

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Example 3; SEQ ID NO 19; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene 

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT

CC polynucleotide sequence when delivered to a cell, increases cholinergic 

CC function in the cell that is in a patient having Parkinson's disease, 

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or 

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 

CC signalling. The present sequence represents a bacterial artificial 

CC chromosome (BAC) sequence containing the hCHT gene. Note: The sequence 

CC data for this patent was obtained in electronic format directly from the 

CC USPTO web site at seqdata.uspto.gov. 

XX 

SQ Sequence 119040 BP; 37072 A; 22876 C; 21708 G; 36882 T; 0 U; 502 Other; 



Query Match 36.2%; Score 630.8; DB 9; Length 119040;
Best Local Similarity 99.7%; Pred. No. 7.1e-172;
Matches 632; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy   1110 AAATGCTTCGGACAAAGAAATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGC 1169
          | | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  30755 ACAGGCTTCGGACAAAGAAATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGC 30814

Qy   1170 ATCTGCAACAGCCATGGCCTTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTC 1229
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  30815 ATCTGCAACAGCCATGGCCTTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTC 30874

Qy   1230 TGACCTTGTTTACATCGTTATCTTCCCCCAGCTGCTTTGTGTACTCTTTGTTAAGGGAAC 1289
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  30875 TGACCTTGTTTACATCGTTATCTTCCCCCAGCTGCTTTGTGTACTCTTTGTTAAGGGAAC 30934



Qy   1290 CAACACCTATGGGGCCGTGGCAGGTTATGTTTCTGGCCTCTTCCTGAGAATAACTGGAGG 1349
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  30935 CAACACCTATGGGGCCGTGGCAGGTTATGTTTCTGGCCTCTTCCTGAGAATAACTGGAGG 30994

Qy   1350 GGAGCCATATCTGTATCTTCAGCCCTTGATCTTCTACCCTGGCTATTACCCTGATGATAA 1409
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  30995 GGAGCCATATCTGTATCTTCAGCCCTTGATCTTCTACCCTGGCTATTACCCTGATGATAA 31054

Qy   1410 TGGTATATATAATCAGAAATTTCCATTTAAAACACTTGCCATGGTTACATCATTCTTAAC 1469
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  31055 TGGTATATATAATCAGAAATTTCCATTTAAAACACTTGCCATGGTTACATCATTCTTAAC 31114

Qy   1470 CAACATTTGCATCTCCTATCTAGCCAAGTATCTATTTGAAAGTGGAACCTTGCCACCTAA 1529
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  31115 CAACATTTGCATCTCCTATCTAGCCAAGTATCTATTTGAAAGTGGAACCTTGCCACCTAA 31174

Qy   1530 ATTAGATGTATTTGATGCTGTTGTTGCAAGACACAGTGAAGAAAACATGGATAAGACAAT 1589
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  31175 ATTAGATGTATTTGATGCTGTTGTTGCAAGACACAGTGAAGAAAACATGGATAAGACAAT 31234

Qy   1590 TCTTGTCAAAAATGAAAATATTAAATTAGATGAACTTGCACTTGTGAAGCCACGACAGAG 1649
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  31235 TCTTGTCAAAAATGAAAATATTAAATTAGATGAACTTGCACTTGTGAAGCCACGACAGAG 31294

Qy   1650 CATGACCCTCAGCTCAACTTTCACCAATAAAGAGGCCTTCCTTGATGTTGATTCCAGTCC 1709
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  31295 CATGACCCTCAGCTCAACTTTCACCAATAAAGAGGCCTTCCTTGATGTTGATTCCAGTCC 31354

Qy   1710 AGAAGGGTCTGGGACTGAAGATAATTTACAGTGA 1743
          ||||||||||||||||||||||||||||||||||
Db  31355 AGAAGGGTCTGGGACTGAAGATAATTTACAGTGA 31388



RESULT 14
ADD50651
ID ADD50651 standard; DNA; 142299 BP.
XX
AC ADD50651;
XX
DT 15-JAN-2004 (first entry)
XX
DE BAC sequence #2 containing hCHT DNA.
XX
KW Human; high-affinity choline transporter; hCHT; chromosome 2q12;
KW cholinergic function; Parkinson's disease; Huntington's disease;
KW Alzheimer's disease; schizophrenia; dysautonomia; myasthenia gravis;
KW brain; cholinergic signalling; antiparkinsonian; anticonvulsant;
KW nootropic; neuroprotective; neuroleptic; bacterial artificial chromosome;
KW BAC; ds.
XX
OS Homo sapiens.
XX
PN US2003114399-A1.
XX
PD 19-JUN-2003.
XX
PF 23-JUL-2001; 2001US-00911077.
XX
PR 23-JUL-2001; 2001US-00911077.
XX
PA (BLAK/) BLAKELY R D.
PA (APPA/) APPARSUNDARAM S.
PA (FERG/) FERGUSON S.
XX
PI Blakely RD, Apparsundaram S, Ferguson S;
XX
DR WPI; 2003-810914/76.
XX
PT Novel isolated polynucleotide encoding human or mouse high affinity
PT choline transporter polypeptide, useful in gene therapy to increase
PT cholinergic function in a cell of a patient suffering from Alzheimer's
PT disease.
XX
PS Example 1; SEQ ID NO 14; 74pp; English.
XX
CC The present invention relates to the isolation of polynucleotide
CC sequences encoding human and mouse high-affinity choline transporter
CC (hCHT and mCHT respectively), and the proteins they encode. The gene
CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence
CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT
CC polynucleotide sequence when delivered to a cell, increases cholinergic
CC function in the cell that is in a patient having Parkinson's disease,
CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or
CC myasthenia gravis. The hCHT antibody is useful for controlling
CC transporter CHT proteins to the brain, and for treating the above
CC mentioned diseases. The antibody is also useful for diagnosing the above
CC mentioned disorders and to detect the influence of cholinergic
CC signalling. The present sequence represents a bacterial artificial
CC chromosome (BAC) sequence containing hCHT DNA. Note: The sequence data
CC for this patent was obtained in electronic format directly from the USPTO
CC web site at seqdata.uspto.gov.
XX

SQ Sequence 142299 BP; 42714 A; 27041 C; 26747 G; 44494 T; 0 U; 1303 Other; 

Query Match 36.2%; Score 630.8; DB 9; Length 142299; 

Best Local Similarity 99.7%; Pred. No. 7.8e-172; 

Matches 632; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 
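
The percentage fields in each result block appear to follow from the raw counts. The sketch below is an inference from the numbers printed in this report, not a documented GenCore formula: it assumes "Query Match" is the hit score divided by the perfect score (1743 for this query), and "Best Local Similarity" is the number of identical columns divided by all aligned columns. The function names are hypothetical.

```python
def query_match_pct(score, perfect_score):
    """Hit score as a percentage of the query's perfect (self-match) score.

    Assumption: this is how the report derives "Query Match".
    """
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches, conservative, mismatches, indels):
    """Identical columns as a percentage of all aligned columns.

    Assumption: gap (indel) columns count toward the alignment length.
    """
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


if __name__ == "__main__":
    # Figures taken from the RESULT 14 block of this report.
    print(query_match_pct(630.8, 1743))
    print(best_local_similarity_pct(632, 0, 2, 0))
```

Under these assumptions the RESULT 14 figures (Score 630.8; 632 matches and 2 mismatches over 634 columns) reproduce the printed 36.2% and 99.7%.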

Qy   1110 AAATGCTTCGGACAAAGAAATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGC 1169
          | | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  94673 ACAGGCTTCGGACAAAGAAATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGC 94732

Qy   1170 ATCTGCAACAGCCATGGCCTTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTC 1229
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  94733 ATCTGCAACAGCCATGGCCTTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTC 94792

Qy   1230 TGACCTTGTTTACATCGTTATCTTCCCCCAGCTGCTTTGTGTACTCTTTGTTAAGGGAAC 1289
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  94793 TGACCTTGTTTACATCGTTATCTTCCCCCAGCTGCTTTGTGTACTCTTTGTTAAGGGAAC 94852

Qy   1290 CAACACCTATGGGGCCGTGGCAGGTTATGTTTCTGGCCTCTTCCTGAGAATAACTGGAGG 1349
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  94853 CAACACCTATGGGGCCGTGGCAGGTTATGTTTCTGGCCTCTTCCTGAGAATAACTGGAGG 94912

Qy   1350 GGAGCCATATCTGTATCTTCAGCCCTTGATCTTCTACCCTGGCTATTACCCTGATGATAA 1409
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  94913 GGAGCCATATCTGTATCTTCAGCCCTTGATCTTCTACCCTGGCTATTACCCTGATGATAA 94972

Qy   1410 TGGTATATATAATCAGAAATTTCCATTTAAAACACTTGCCATGGTTACATCATTCTTAAC 1469
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  94973 TGGTATATATAATCAGAAATTTCCATTTAAAACACTTGCCATGGTTACATCATTCTTAAC 95032

Qy   1470 CAACATTTGCATCTCCTATCTAGCCAAGTATCTATTTGAAAGTGGAACCTTGCCACCTAA 1529
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  95033 CAACATTTGCATCTCCTATCTAGCCAAGTATCTATTTGAAAGTGGAACCTTGCCACCTAA 95092

Qy   1530 ATTAGATGTATTTGATGCTGTTGTTGCAAGACACAGTGAAGAAAACATGGATAAGACAAT 1589
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  95093 ATTAGATGTATTTGATGCTGTTGTTGCAAGACACAGTGAAGAAAACATGGATAAGACAAT 95152

Qy   1590 TCTTGTCAAAAATGAAAATATTAAATTAGATGAACTTGCACTTGTGAAGCCACGACAGAG 1649
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  95153 TCTTGTCAAAAATGAAAATATTAAATTAGATGAACTTGCACTTGTGAAGCCACGACAGAG 95212

Qy   1650 CATGACCCTCAGCTCAACTTTCACCAATAAAGAGGCCTTCCTTGATGTTGATTCCAGTCC 1709
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  95213 CATGACCCTCAGCTCAACTTTCACCAATAAAGAGGCCTTCCTTGATGTTGATTCCAGTCC 95272

Qy   1710 AGAAGGGTCTGGGACTGAAGATAATTTACAGTGA 1743
          ||||||||||||||||||||||||||||||||||
Db  95273 AGAAGGGTCTGGGACTGAAGATAATTTACAGTGA 95306





RESULT 15 
AAF81710 

ID AAF81710 standard; cDNA; 1731 BP. 
XX 

AC AAF81710; 
XX 

DT 01-JUN-2001 (first entry) 
XX 

DE C. elegans high affinity choline transporter protein encoding cDNA. 
XX 

KW High affinity choline transporter; cho-1; Alzheimer's disease; diagnosis; 

KW ss. 

XX 

OS Caenorhabditis elegans. 
XX 

FH Key Location/Qualifiers 
FT CDS 1..1731

FT /*tag= a 

FT /product= "high affinity choline transporter" 

XX 

PN WO200116315-A1. 
XX 

PD 08-MAR-2001. 
XX 

PF 18-AUG-2000; 2000WO-JP005545.
XX

PR 27-AUG-1999; 99JP-00240642.

PR 27-DEC-1999; 99JP-00368991.
XX 

PA (NISC-) JAPAN SCI & TECHNOLOGY CORP. 
XX 

PI Haga T, Okuda T; 
XX 

DR WPI; 2001-226688/23. 



DR P-PSDB; AAB74663. 
XX 

PT New rat and human spinal cord high affinity choline transporters, useful 

PT in diagnosis of Alzheimer's disease and screening promoters as drugs for

PT treating Alzheimer's disease. 
XX 

PS Claim 3; Page 57-62; 90pp; Japanese. 
XX 

CC The present sequence encodes a Caenorhabditis elegans high affinity 

CC choline transporter protein designated cho-1. The cho-1 protein has 

CC nootropic and neuroprotective activities. The cho-1 polynucleotide and 

CC protein can be used for the diagnosis of diseases related to the 

CC expression of cho-1 by comparing the cho-1 polynucleotide sequence in a 

CC sample to that of a control. Drug compositions containing the cho-1 

CC protein or expression promoters or inhibitors of cho-1 are useful for 

CC treating disorders characterised by abnormal levels of cho-1, such as

CC Alzheimer's disease.
XX 

SQ Sequence 1731 BP; 428 A; 373 C; 427 G; 503 T; 0 U; 0 Other; 



Query Match 20.9%; Score 363.8; DB 4; Length 1731; 

Best Local Similarity 55.1%; Pred. No. 5.5e-95; 

Matches 862; Conservative 0; Mismatches 637; Indels 66; Gaps 5; 
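
The reported score can be cross-checked against the column counts and the gap parameters in the run header (Gapop 10.0, Gapext 1.0). The sketch below rests on assumptions inferred from the figures in this report rather than from any GenCore documentation: a match scores +1, a mismatch scores -0.6, each gap costs Gapop once, and each gapped position (indel) costs Gapext. The function name and default weights are hypothetical.

```python
def expected_score(matches, mismatches, indels, gaps,
                   match=1.0, mismatch=-0.6, gap_open=10.0, gap_extend=1.0):
    """Rebuild an alignment score from its column counts.

    Assumptions (inferred, not documented): the -0.6 mismatch weight, and a
    gap of k positions costing gap_open + k * gap_extend.
    """
    return (matches * match
            + mismatches * mismatch
            - gaps * gap_open
            - indels * gap_extend)


if __name__ == "__main__":
    # Counts from the RESULT 15 block: 862 matches, 637 mismatches,
    # 66 indel positions spread over 5 gaps.
    print(round(expected_score(862, 637, 66, 5), 1))
```

With these weights the RESULT 15 counts give 862 - 382.2 - 50 - 66 = 363.8, which agrees with the printed score.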

Qy 19 GGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTGGTTGGAATATGGGCTGCC 78 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 16 GGTATCGTGGCCATTGTGTTCTTCTACGTGCTCATTCTTGTCGTTGGAATATGGGCGGGT 75 

Qy 79 T GGAGAAC CAAAA ACAGT G GC AGC GCAGAAGAGCG C AG C GAAGC CAT C 126 

I I I I I I I I I I I I I I I I I I I 

Db 76 AGAAAAT CGAAAAGTT CAAAAGAGCT T GAAT C AGAAGC C GG CGC GGC GAC G GAAGAGGT G 135 

Qy 127 ATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCTACCTGG 186 

II I I I I I II I II II I I I I I II I I I I I I I I I I I I I I 

Db 136 AT GTT AGCT GGGAGAAACAT CGGAACT CTT GT C G GAAT TT T CACAAT GAC T GC CAC GT GG 195 

Qy 187 GTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTATGGCCTA 24 6 

II II II I I I I I I I I I I I I I I I I I I I I I II I I I I I II 
Db 196 GTTGGCGGTGCTTATATCAATGGAACCGCCGAGGCTCTGTATAATGGAGGT CTC 24 9 

Qy 247 GCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTGTTCTTT 306 

II I I I I I I I I I I I I II I I I I II II I I I I I I I I I III 

Db 2 50 CTTGGATGTCAGGCTCCAGTTGGATATGCAATTTCCCTTGTTATGGGAGGACTACTTTTC 309 

Qy 307 GCAAAAC CT AT GC GT T C AAAGGGGT AT GT GACC AT GT TAG AC C C GT T T C AGC AAAT CT AT 366 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I III 

Db 310 GCAAAGAAAAT GC GAGAAGAAGGAT AT ATT ACAAT G CT CGAT C CT TT T CAGCACAAAT AT 369 

Qy 367 GGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTCTGGGCT 426 

II MM II M M I I I I I I I I I I I I I I I II MM I II I I I I 



370 G GCCAAC GAAT CGGTGGCTT GAT GT AT GT T CCAG CACT T CT T GGT GAAACAT T CT GGACA 429 

427 GCAGCAAT TTT CT CT GCTTTGGGAGCCACCAT CAGCGT GAT CATCGAT GTGGATAT GCAC 4 86 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 

430 GCAGCCATTCTTTCGGCACTTGGTGCAACACTGTCGGTAATTCTTGGAATCGACATGAAT 489 

4 87 ATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGGCTCTAT 54 6 

II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I 

490 G CAT C AGT GAC CCTGTCGGCCT GT AT T G C C GT AT T CT AC AC AT T C AC C G GT G GAT ACT AT 549 

547 TCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGGATCAGC 606 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I II 
550 GCAGTCGCGTACACTGACGTCGTTCAACTATTTTGCATTTTCGTCGGTTTGTGGGTTTGC 609 

607 GT C C C CT T TGCATT GT C ACAT C CT GCAGT C GCAGACAT C GGGTT C ACT G CT GT GCAT GC C 666 

I I I I II II III II I I I I I I I I I I 
610 GT GC C GGC GGCT AT GGT G CAT GAT G GT GCGAAGGAT AT T T C CAGGAAT GCAG 661 

667 AAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGGCTTGAT 726 

I I I I I I I I III I I I I I I I I I 

662 G CGACT GGAT T GGAGAGATT GGAG GAT T CAAAGAAAC AT CT CT C T GGAT T GAT 714 

727 AGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGGGTTCTC 78 6 

I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

715 TGCATGCTTCTCCTTGTCTTTGGAGGAATTCCATGGCAAGTGTACTTCCAAAGAGTTCTC 774 

7 87 TCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGGTGCCTG 846 

I I I I III I I I I I I I I I I I I I I I I I I I I I I I I 

775 TCCTCAAAAACTGCTCATGGAGCACAGACGTTGTCGTTTGTGGCGGGCGTCGGATGCATT 834 

847 GT GAT GGC CAT C C C AG C CAT ACT CAT T GGGG C CAT T GG AGC AT C AACAGAC T GGAAC CAG 906 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 

835 CT CAT GGC GAT T C C AC CAGCGT TGAT C GGT G CAAT T G C CAGGAACACAGACT GGAGAAT G 894 

907 ACTGCATATGGGCTTCC AGAT CCCAAGACTACAGAAGAGGCA 94 8 

I I I I I I I I I I I I I I I I I I I 

895 AC T GAT TAT T C C C CAT G G AAC AAT G G AAC T AAG GT C GAAT C GAT T C C AC C G G AT AAGAG A 954 

949 GACATGATTTTACCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGT 1008 

I I I I I I III III I I I I I I I I I II I I I I I MM 

955 AACATGGTGGTCCCGTTGGTATTCCAGTATCTTACGCCAAGATGGGTCGCCTTTATTGGA 1014 

1009 CTTGGTGCAGTTTCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGT 1068 

II II I I I I I II I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I 

1015 CTCGGCGCAGTGTCGGCTGCTGTAATGTCATCTGCAGATTCATCTGTACTATCAGCAGCA 1074 

1069 TCCATGTTTGCACGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAA 1128 

I I I I I I I I M I I II I II I MM I I II I I I I I II I I II II 
1075 TCAATGTTTGCTCACAACATCTGGAAGCTCACAATTCGCCCTCACGCGTCTGAAAAAGAA 1134

112 9 ATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCC 1188 

I I II I II I I II I I I I II I Mill I I I I I I 

1135 GT GAT AATTGT GAT GAGAAT AGC CAT C AT CT GT GT T GGT AT CAT GGCAAC CAT CAT GGCA 1194 

1189 TTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTT 1248 

I I III I II I I II M II I II I I II I II I II Mill I 

1195 CTTACCATTCAATCCATCTATGGGCTTTGGTATCTTTGTGCAGATTTGGTCTACGTCATA 1254 



1249 ATCTTCCCCCAGCTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTG 1308 

I I I I I I I I I I I I I I I I I I I I I I I III M inn I II 

1255 CT CT T CC CT CAACT ATT AT GT GTT GT AT AT AT GC C AC GT AGCAAT ACGT AT GG CT CAT T G 1314 

Qy 1309 G CAGGTT AT GT TT CT GGC CT CT T C CT GAGAAT AAC T GGAGG GGAGCCAT AT CT GTAT CT T 1368 

I I I I I I I I I I II III I I I I I I I I I I I I I I I III II 

Db 1315 GCTGGCTATGCAGTCGGTCTTGTGCTCCGTTTGATTGGAGGCGAGCCACTTGTATCGCTG 1374 

Qy 1369 C AG C C C T T GAT CT T CT AC C CT GGCT AT T AC C CT GAT GAT AAT G GT AT AT AT AAT C AGAAA 1428 

I I I I I I I I I I I II III I I I I 

Db 1375 CCAGCGTTCTTC CAT T AT C C AAT G T AT AC G GAT G G G G TACAGTAT 1419 

Qy 1429 T T T C CAT T T AAAAC ACT T GC C AT GGT T AC AT CAT T CT T AAC C AAC AT TT G CAT CT C CT AT 14 88 

II I I I I I I III I I I I I I I I I I I I I I I I Ml 

Db 1420 TT CC CATT CAGGACAACT GCT AT GT TAT CTT CAAT GGCT ACTAT CTACATT GT AT CAATA 1479 

Qy 1489 C T AG C C AAGT AT CT AT T T GAAAGT G GAAC CT T G C C AC C T AAAT TAG AT GTAT T T GAT GCT 1548 

III I I I I I I I II III I I I I I I II I I I I I I I I I 

Db 1480 CAATCGGAGAAGCTGTTCAAATCGGGACGTTTGTCTCCGGAGTGGGACGTAATGGGTTGT 1539 

Qy 1549 GTTGT 1553 

II II 

Db 1540 GTAGT 1544 



Qy 

Db 



Search completed: March 22, 2004, 12:00:55 
Job time : 755 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model

Run on: March 22, 2004, 11:12:56 ; Search time 144 Seconds
                                   (without alignments)
                                   6717.218 Million cell updates/sec

Title: US-10-069-541-5

Perfect score: 1743

Sequence: 1 atggctttccatgtggaagg ctgaagataatttacagtga 1743

Scoring table: IDENTITY_NUC

Gapop 10.0 , Gapext 1.0

Searched: 682709 seqs, 277475446 residues

Total number of hits satisfying chosen parameters: 1365418

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database : Issued_Patents_NA:*

1: /cgn2_6/ptodata/2/ina/5A_COMB.seq:*
2: /cgn2_6/ptodata/2/ina/5B_COMB.seq:*
3: /cgn2_6/ptodata/2/ina/6A_COMB.seq:*
4: /cgn2_6/ptodata/2/ina/6B_COMB.seq:*
5: /cgn2_6/ptodata/2/ina/PCTUS_COMB.seq:*
6: /cgn2_6/ptodata/2/ina/backfiles1.seq:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
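
The header states that the search uses the "sw model" with Gapop 10.0 and Gapext 1.0. As a rough illustration of what such a score is, the sketch below computes a Smith-Waterman local alignment score with affine gaps (Gotoh recurrences). It is not GenCore's implementation: the match weight of +1, the mismatch weight of -0.6, and the convention that a gap of k positions costs gap_open + k * gap_extend are assumptions inferred from the figures in this report, and the function name is hypothetical.

```python
def smith_waterman_affine(query, target, match=1.0, mismatch=-0.6,
                          gap_open=10.0, gap_extend=1.0):
    """Best local alignment score between two sequences (score only).

    Assumed gap convention: a gap of k positions costs
    gap_open + k * gap_extend.
    """
    neg_inf = float("-inf")
    m = len(target)
    h_prev = [0.0] * (m + 1)       # best score ending at (i-1, j)
    f_prev = [neg_inf] * (m + 1)   # best score ending with a gap in target
    best = 0.0
    for q in query:
        h_cur = [0.0] * (m + 1)
        f_cur = [neg_inf] * (m + 1)
        e = neg_inf                # best score ending with a gap in query
        for j in range(1, m + 1):
            # Extend an open gap or open a new one, in either sequence.
            e = max(e - gap_extend, h_cur[j - 1] - gap_open - gap_extend)
            f_cur[j] = max(f_prev[j] - gap_extend,
                           h_prev[j] - gap_open - gap_extend)
            s = match if q == target[j - 1] else mismatch
            # Local alignment: scores never drop below zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


if __name__ == "__main__":
    print(smith_waterman_affine("ACGTACGT", "ACGTTCGT"))
```

The routine keeps only two rows of the dynamic-programming matrix, so it returns the score without a traceback; recovering the aligned Qy/Db rows shown under ALIGNMENTS would require storing the full matrices.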



SUMMARIES 



                  %
Result          Query
  No.    Score  Match   Length  DB  ID                  Description
    1   1738.2   99.7     1743   4  US-09-657-252-1     Sequence 1, Appli
    2     47.6    2.7     7218   1  US-08-232-463-14    Sequence 14, Appl
    3       41    2.4     1857   4  US-09-640-198D-3    Sequence 3, Appli
    4       41    2.4     2839   4  US-08-595-553A-1    Sequence 1, Appli
    5     39.6    2.3      474   4  US-09-621-976-18033 Sequence 18033, A
    6     39.2    2.2     1506   4  US-09-328-352-2245  Sequence 2245, Ap
    7       39    2.2      558   4  US-09-328-352-3451  Sequence 3451, Ap
    8     38.2    2.2     2028   4  US-10-162-012-28    Sequence 28, Appl
    9     38.2    2.2     2326   4  US-10-162-012-26    Sequence 26, Appl
   10     38.2    2.2  1830121   4  US-09-557-884-1     Sequence 1, Appli
   11     38.2    2.2  1830121   4  US-09-643-990A-1    Sequence 1, Appli
c
c
   12       38    2.2     1932   4  US-09-640-198D-1    Sequence 1, Appli
   13     37.4    2.1     4344   4  US-09-601-198-165   Sequence 165, App
c  14     36.6    2.1     4160   4  US-09-134-218-1     Sequence 1, Appli
   15     36.6    2.1   148567   4  US-09-801-876B-3    Sequence 3, Appli
   16     36.6    2.1   148567   4  US-10-254-869-3     Sequence 3, Appli
   17     35.6    2.0      447   4  US-09-621-976-12063 Sequence 12063, A
c  18     35.6    2.0     2397   4  US-09-221-017B-272  Sequence 272, App
   19     35.2    2.0     1461   4  US-09-543-681A-2066 Sequence 2066, Ap
   20     35.2    2.0     2238   1  US-07-841-651-1     Sequence 1, Appli
   21     34.8    2.0      902   4  US-09-671-317-37    Sequence 37, Appl
   22     34.8    2.0     1593   4  US-09-134-001C-1673 Sequence 1673, Ap
c  23     34.8    2.0    12482   4  US-09-512-563C-25   Sequence 25, Appl
c  24     34.8    2.0    25002   4  US-08-961-527-48    Sequence 48, Appl
   25     34.8    2.0  1664976   4  US-08-916-421B-1    Sequence 1, Appli
c  26     34.6    2.0      561   4  US-09-107-532A-3215 Sequence 3215, Ap
   27     34.6    2.0     1005   4  US-09-107-532A-3570 Sequence 3570, Ap
   28     34.6    2.0     2847   4  US-09-484-970B-22   Sequence 22, Appl
c  29     34.6    2.0  1664976   4  US-08-916-421B-1    Sequence 1, Appli
   30     34.2    2.0     1515   4  US-09-071-035-431   Sequence 431, App
   31     34.2    2.0     1803   4  US-09-071-035-429   Sequence 429, App
c  32       34    2.0     1109   4  US-08-956-171E-222  Sequence 222, App
c  33       34    2.0   392000   4  US-10-027-983-11    Sequence 11, Appl
c  34     33.8    1.9      369   4  US-09-543-681A-628  Sequence 628, App
c  35     33.8    1.9     3172   1  US-07-741-940-3     Sequence 3, Appli
c  36     33.8    1.9     3172   1  US-08-289-548A-3    Sequence 3, Appli
c  37     33.8    1.9     3172   1  US-08-452-654-3     Sequence 3, Appli
c  38     33.8    1.9     3172   1  US-08-452-655B-3    Sequence 3, Appli
c  39     33.8    1.9     3172   3  US-08-450-582-3     Sequence 3, Appli
c  40     33.8    1.9     3172   4  US-08-449-731-3     Sequence 3, Appli
   41     33.8    1.9   176373   3  US-09-128-155-17    Sequence 17, Appl
   42     33.6    1.9    84495   4  US-09-797-906-3     Sequence 3, Appli
   43     33.4    1.9     1626   4  US-09-328-352-602   Sequence 602, App
   44     33.4    1.9     3593   4  US-09-404-627-3     Sequence 3, Appli
   45     33.4    1.9     4205   4  US-09-404-627-1     Sequence 1, Appli



ALIGNMENTS 



RESULT 1 
US-09-657-252-1 

; Sequence 1, Application US/09657252
; Patent No. 6500643
; GENERAL INFORMATION:
; APPLICANT: Wu, Dong-Hai
; APPLICANT: Gu, Yunrong
; APPLICANT: Millard, William
; APPLICANT: He, Yun-Je
; TITLE OF INVENTION: Human High Affinity Choline Transporter cDNA
; FILE REFERENCE: MBHB00-639
; CURRENT APPLICATION NUMBER: US/09/657,252
; CURRENT FILING DATE: 2000-09-07
; NUMBER OF SEQ ID NOS: 6
; SOFTWARE: PatentIn version 3.0
; SEQ ID NO 1
; LENGTH: 1743
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: CDS
; LOCATION: (1)..(1743)
US-09-657-252-1 

Query Match 99.7%; Score 1738.2; DB 4; Length 1743; 

Best Local Similarity 99.8%; Pred. No. 0; 

Matches 1740; Conservative 0; Mismatches 3; Indels 0; Gaps 0 

Qy      1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

Qy     61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

Qy    121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

Qy    181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

Qy    241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300

Qy    301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

Qy    361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

Qy    421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

Qy    481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

Qy    541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600

Qy    601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660

Qy    661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

Qy    721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780

Qy    781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

Qy    841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
          ||||||||||||||||||||||||||||||||||||||||||||||| || |||||||||
Db    841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCCTCCACAGACTGG 900

Qy    901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

Qy    961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020

Qy   1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080

Qy   1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140

Qy   1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

Qy   1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260

Qy   1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320

Qy   1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380

Qy   1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

Qy   1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

Qy   1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560

Qy   1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

Qy   1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

Qy   1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAA 1740

Qy   1741 TGA 1743
          |||
Db   1741 TGA 1743



RESULT 2 

US-08-232-463-14 

; Sequence 14, Application US/08232463
; Patent No. 5670367
; GENERAL INFORMATION:
; APPLICANT: DORNER, F.
; APPLICANT: SCHEIFLINGER, F.
; APPLICANT: FALKNER, F. G.
; TITLE OF INVENTION: RECOMBINANT FOWLPOX VIRUS
; NUMBER OF SEQUENCES: 52
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Foley & Lardner
; STREET: 1800 Diagonal Road, Suite 500
; CITY: Alexandria
; STATE: VA
; COUNTRY: USA
; ZIP: 22313-0299
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/232,463
; FILING DATE:
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: US/07/935,313
; FILING DATE:
; APPLICATION NUMBER: EP 91 114 300.6
; FILING DATE: 26-AUG-1991
; ATTORNEY/AGENT INFORMATION:
; NAME: BENT, Stephen A.
; REGISTRATION NUMBER: 29,768
; REFERENCE/DOCKET NUMBER: 30472/114 IMMU
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (703)836-9300
; TELEFAX: (703)683-4109
; TELEX: 899149
; INFORMATION FOR SEQ ID NO: 14:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 7218 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; IMMEDIATE SOURCE:
; CLONE: pTZgpt-Fls

US-08-232-463-14 

Query Match 2.7%; Score 47.6; DB 1; Length 7218; 

Best Local Similarity 6.0%; Pred. No. 0.00054; 

Matches 23; Conservative 200; Mismatches 159; Indels 0; Gaps 0; 
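
This is the only result in this part of the report with a non-zero "Conservative" count, and its database sequence contains long runs of the ambiguity code Y (C or T). The sketch below shows one way such columns could be classified. It assumes, as an inference from these figures rather than a documented rule, that a column is "conservative" when the two symbols are not identical plain bases but their IUPAC base sets overlap. The function name is hypothetical, and gap characters are not handled.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def classify_columns(qy, db):
    """Count matches, conservative columns and mismatches in an ungapped
    pair of aligned rows.

    Assumption: "conservative" means the IUPAC base sets overlap without
    the symbols being the same unambiguous base.
    """
    counts = {"matches": 0, "conservative": 0, "mismatches": 0}
    for q, d in zip(qy.upper(), db.upper()):
        if q == d and q in "ACGT":
            counts["matches"] += 1
        elif set(IUPAC[q]) & set(IUPAC[d]):
            counts["conservative"] += 1
        else:
            counts["mismatches"] += 1
    return counts


if __name__ == "__main__":
    # First ten columns of this alignment (Qy 692-701 against a run of Y).
    print(classify_columns("CTGTTGACTC", "YYYYYYYYYY"))
```

If a conservative column is further assumed to score +0.6 and a mismatch -0.6 (weights inferred from this report), the printed counts give 23 + 0.6 x 200 - 0.6 x 159 = 47.6, which agrees with the reported score.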

Qy    692 CTGTTGACTCATCTGAAGTCTACTCTTGGCTTGATAGTTTTCTGTTGTTGATGCTGGGTG 751
          :: ::  ::: :::    ::: :::::  :::  :  :::::: :: ::  : ::   :
Db   1088 YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY 1147

Qy    752 GAATCCCATGGCAAGCATACTTTCAGAGGGTTCTCTCTTCTTCCTCAGCCACCTATGCTC 811
             :::: :  :   : : :::::      ::::::::::::::::  :: ::: : :::
Db   1148 YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY 1207

Qy    812 AAGTGCTGTCCTTCCTGGCAGCTTTCGGGTGCCTGGTGATGGCCATCCCAGCCATACTCA 871
             : :: ::::::::  :  :::::   : :::  :  :  :: ::::  :: : :::
Db   1208 YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY 1267

Qy    872 TTGGGGCCATTGGAGCATCAACAGACTGGAACCAGACTGCATATGGGCTTCCAGATCCCA 931
          ::    :: ::    : ::  :   ::    ::   :: : : :   :::::   ::::
Db   1268 YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY 1327

Qy    932 AGACTACAGAAGAGGCAGACATGATTTTACCAATTGTTCTGCAGTATCTCTGCCCTGTGT 991
             :: :        :   : :  :::: ::  :: :::: :  : ::::: :::: : :
Db   1328 YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY 1387

Qy    992 ATATTTCTTTCTTTGGTCTTGGTGCAGTTTCTGCTGCTGTTATGTCATCAGCAGATTCTT 1051
           : :::::::::::  ::::  : :  ::::: :: :: :: : ::   | || ||||||
Db   1388 YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYGTACCAAATTCTT 1447

Qy   1052 CCATCTTGTCAGCAAGTTCCAT 1073
          | ||||  | | | | || |||
Db   1448 CTATCTCTTTAACTACTTGCAT 1469



RESULT 3 

US-09-640-198D-3 

; Sequence 3, Application US/09640198D 

; Patent No. 6586411 

; GENERAL INFORMATION: 

; APPLICANT: Russell, Stephen 

; APPLICANT: Kay Whye, Peng 

; TITLE OF INVENTION: System for Monitoring the Location of 
; TITLE OF INVENTION: Transgenes 
; FILE REFERENCE: 07039-295001 

; CURRENT APPLICATION NUMBER: US/09/640,198D

; CURRENT FILING DATE: 2000-08-16 

; PRIOR APPLICATION NUMBER: US 60/149,168 

; PRIOR FILING DATE: 1999-08-17 

; NUMBER OF SEQ ID NOS: 34



; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 3 

; LENGTH: 1857
; TYPE: DNA
; ORGANISM: Rattus sp.
US-09-640-198D-3 

Query Match 2.4%; Score 41; DB 4; Length 1857; 

Best Local Similarity 49.8%; Pred. No. 0.026; 

Matches 104; Conservative 0; Mismatches 105; Indels 0; Gaps 0; 
Qy 404 TGATGGGAGAAATGTTCTGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCG 463 



I I I I I I I I I I II I I I I I I I I I I I I I 

Db 413 TGGTGGCCACGATGCTGTATACAGGCATCGTGATCTACGCGCCTGCGCTCATCCTGAACC 472

Qy 464 TGATCATCGATGTGGATATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGT 523

I I I I I I I I I I II III I I I I I I I I I 

Db 473 AAGTGACCGGGTTGGACATCTGGGCATCGCTCCTGTCCACAGGAATCATCTGCACCTTGT 532

Qy 524 ACACACTGGTGGGAGGGCTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCA 583 

I I I I I II I I I M I I I M I I I I I I I I I 

Db 533 ACACTACCGTGGGTGGTATGAAGGCCGTGGTCTGGACAGATGTGTTCCAGGTTGTGGTAA 592 

Qy 584 TTTTTGTAGGGCTGTGGATCAGCGTCCCC 612 



Db 593 TGCTCGTTGGCTTCTGGGTGATCCTGGCC 621



RESULT 4 

US-08-595-553A-1 

; Sequence 1, Application US/08595553A 
; Patent No. 6391579 
; GENERAL INFORMATION: 

; APPLICANT: NANCY CARRASCO, ET AL.

TITLE OF INVENTION: THYROID SODIUM/IODIDE SYMPORTER AND 
TITLE OF INVENTION: NUCLEIC ACID ENCODING SAME 
NUMBER OF SEQUENCES: 2 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: AMSTER, ROTHSTEIN & EBENSTEIN 

STREET: 90 PARK AVENUE 
CITY: NEW YORK 
; STATE: NEW YORK 

; COUNTRY: U.S.A. 

ZIP: 10016 
COMPUTER READABLE FORM: 

MEDIUM TYPE: 3.5 INCH 1.44 Mb STORAGE 
; MEDIUM TYPE: DISKETTE 

; COMPUTER: IBM PC COMPATIBLE 

OPERATING SYSTEM: MS-DOS 
SOFTWARE: ASCII 
CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/08/595,553A

FILING DATE: FEBRUARY 1, 1996 
ATTORNEY/AGENT INFORMATION: 
NAME: CRAIG J. ARNOLD 
REGISTRATION NUMBER: 34,287 
; REFERENCE/DOCKET NUMBER: 96700/393



TELECOMMUNICATION INFORMATION: 
; TELEPHONE: (212) 697-5995 

TELEFAX: (212) 286-0854 or 286-0082 
TELEX: TWX 710-581-4766 
; INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 2839 
; TYPE: NUCLEIC ACID 

; STRANDEDNESS: DOUBLE

TOPOLOGY: LINEAR 
MOLECULE TYPE: 

DESCRIPTION: OLIGONUCLEOTIDE 
HYPOTHETICAL: NO 
; ANTI-SENSE: NO 
; ORIGINAL SOURCE: 

ORGANISM: RAT 

; INDIVIDUAL ISOLATE: SODIUM/IODIDE SYMPORTER
US-08-595-553A-1 

Query Match 2.4%; Score 41; DB 4; Length 2839; 

Best Local Similarity 49.8%; Pred. No. 0.035; 

Matches 104; Conservative 0; Mismatches 105; Indels 0; Gaps 0; 

Qy 404 TGATGGGAGAAATGTTCTGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCG 463 

I I I I I I I I I I I I I I I I I I I 

Db 522 TGGTGGCCACGATGCTGTATACAGGCATCGTGATCTACGCGCCTGCGCTCATCCTGAACC 581 

Qy 464 TGATCATCGATGTGGATATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGT 523

I I I I I I I I II II III I I I I I I I I Ml Ml 

Db 582 AAGTGACCGGGTTGGACATCTGGGCATCGCTCCTGTCCACAGGAATCATCTGCACCTTGT 641 

Qy 524 ACACACTGGTGGGAGGGCTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCA 583 

II Ill I I I I I I I I I Mill I I 

Db 642 ACACTACCGTGGGTGGTATGAAGGCCGTGGTCTGGACAGATGTGTTCCAGGTTGTGGTAA 701 

Qy 584 TTTTTGTAGGGCTGTGGATCAGCGTCCCC 612

I I I I I I I I II I I I I II 

Db 702 TGCTCGTTGGCTTCTGGGTGATCCTGGCC 730 
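
Each result carries statistics lines built from `Name value;` pairs (Query Match, Score, DB, Length, and so on). A minimal sketch of pulling those pairs into a dictionary follows; the regular expression and the `parse_stats` name are illustrative assumptions based only on the lines visible in this report, and the pattern tolerates the stray space before a semicolon that appears in some of them.

```python
import re

# A field name (letters, dots, spaces), then a number, an optional "%",
# and the terminating semicolon.
STAT_RE = re.compile(
    r"([A-Za-z][A-Za-z. ]*?)\s+([0-9.]+(?:[eE][-+]?[0-9]+)?)\s*%?\s*;"
)


def parse_stats(line: str) -> dict:
    """Return {field name: numeric value} for one statistics line."""
    return {name.strip(): float(value) for name, value in STAT_RE.findall(line)}


stats = parse_stats("Query Match 2.4%; Score 41; DB 4; Length 2839;")
print(stats["Score"], stats["Length"])
```

A field whose value is missing, such as a truncated `Gaps` entry, is simply skipped rather than guessed.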



RESULT 5 

US-09-621-976-18033 

; Sequence 18033, Application US/09621976 
; Patent No. 6639063 
; GENERAL INFORMATION : 

; APPLICANT: Dumas Milne Edwards, J.B. 

; APPLICANT: Jobert, S. 

; APPLICANT: Giordano, J.Y. 

TITLE OF INVENTION: ESTs and Encoded Human Proteins. 
; FILE REFERENCE: GENSET.054PR2
; CURRENT APPLICATION NUMBER: US/09/621,976 
; CURRENT FILING DATE: 2000-07-21 
; NUMBER OF SEQ ID NOS: 19335
; SOFTWARE: Patent.pm
; SEQ ID NO 18033 

LENGTH: 474 

TYPE: DNA 



; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: 16
; OTHER INFORMATION: n=a, g, c or t
US-09-621-976-18033 



Query Match 2.3%; Score 39.6; DB 4; Length 474;

Best Local Similarity 13.4%; Pred. No. 0.03; 

Matches 42; Conservative 134; Mismatches 138; Indels 0; Gaps 



Qy 981 CTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTTTCTGCTGCTGTTATGTCATC 1040

: : | : : : I I I 1 : : : : 1 1 1 : 1 : : : : 1 = 11- 1 • • •

Db 54 SKYCSGSYKKTTTTTTWAWWWTTTTKGKWARRRMSGGGKTTYMMCSKKKTKSCMAGRWKG 113

Qy 1041 AGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCACGGAACATCTACCAGCTTTC 1100

: :::::: : |:: : : : : :: |: ==: : :: : I:

Db 114 KYYYSRWYYYCYKGACYYMWKRWYCSSCCMMYTKGGGSMWTTTWMMRRRKKSYKRWTKGK 173

Qy 1101 CTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTTATGCGAATCACAGTGTTTGT 1160

:: :::||: |: : 1 :|:: | | :| | :: :

Db 174 KKKKTTOMMAAMCYTTWRSYWIMMMRRAAAAKTYYYCMMSKTMCCMACCCMMCCMRRARS 233

Qy 1161 GTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAAACTGTGTATGGGCTCTGGTA 1220

: ::: |: | : :: |::::: |: ::: :::::: :::|:| :s

Db 234 CCMRSCMRSYTYMMCYYYYMMYKGGRMYWWWRGGMWKRMYWMYKKKSMWKGSCMWKRAWW 293

Qy 1221 CCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAGCTGCTTTGTGTACTCTTTGT 1280

: : | : I : : | : : : : 1 : 1 : : : : : : : 1 : ::::::::

Db 294 ARKTTYYTWAWYYTTYYKRMCCYYMRKTTYCMMMWYSRWWRGSMWTARGAWWMCYWWYYY 353

Qy 1281 TAAGGGAACCAACA 1294

II::: : :| 1

Db 354 MAARKKKYMWWAAA 367





RESULT 6 

US-09-328-352-2245 

; Sequence 2245, Application US/09328352 

; Patent No. 6562958 

; GENERAL INFORMATION: 

; APPLICANT: Gary L. Breton et al.

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO ACINETOBACTER
; TITLE OF INVENTION: BAUMANNII FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: GTC99-03PA
; CURRENT APPLICATION NUMBER: US/09/328,352
; CURRENT FILING DATE: 1999-06-04
; NUMBER OF SEQ ID NOS: 8252
; SEQ ID NO 2245 

LENGTH: 1506 
; TYPE: DNA 

ORGANISM: Acinetobacter baumannii 
US-09-328-352-2245 



Query Match 2.2%; Score 39.2; DB 4; Length 1506; 

Best Local Similarity 51.1%; Pred. No. 0.085; 



Matches 92; Conservative 0; Mismatches 88; Indels 0; Gaps 0; 

Qy 1008 TCTTGGTGCAGTTTCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAG 1067 

III I tit III I I I I I I I I I I I II I II I I I I I 

Db 1005 TCTAGCTGCTATTTTAGCTGCGGTTATGAGTACATTAAGCTGTCAGCTTTTGGTATGTTC 1064 

Qy 1068 TTCCATGTTTGCACGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGA 1127

I I I I I I I I III I I I II I I I I I I I II I I 

Db 1065 AAGTGCACTAACTGAAGATTTGTACAAAGGCTTCATTCGTAAAAATGCATCTCAAAAAGA 1124

Qy 1128 AATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGC 1187 

I I I I I I I I I I I I I I I I I I I I I I III I I I I I I III 

Db 1125 GCTTGTATGGGTTGGGCGTATCATGGTGCTTGCAATTGCCGTTCTAGCAATTGTGCTTGC 1184 



RESULT 7 

US-09-328-352-3451 

; Sequence 3451, Application US/09328352 

; Patent No. 6562958 

; GENERAL INFORMATION: 

; APPLICANT: Gary L. Breton et al.

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO ACINETOBACTER
; TITLE OF INVENTION: BAUMANNII FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: GTC99-03PA
; CURRENT APPLICATION NUMBER: US/09/328,352
; CURRENT FILING DATE: 1999-06-04
; NUMBER OF SEQ ID NOS: 8252
; SEQ ID NO 3451 

LENGTH: 558 
; TYPE: DNA 

; ORGANISM: Acinetobacter baumannii 
US-09-328-352-3451 

Query Match 2.2%; Score 39; DB 4; Length 558; 

Best Local Similarity 50.3%; Pred. No. 0.051; 

Matches 96; Conservative 0; Mismatches 95; Indels 0; Gaps 0; 

Qy 1455 TACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTATCTATTTGAAAGTGG 1514

II I I II II Ml I Ml I M I II II I Ml Ml I 

Db 341 TAAATCAAAATGATGCAAATGCTTCATGGCTGATGTTGCAAACTTCAACTTTTCAAGATG 400

Qy 1515 AACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGACACAGTGAAGAAAA 1574

I I I II II I I I I I I I I I I I I I I I I II I II 

Db 401 GCCGTAGTCATCTGAATGCGGCAAAGCTCAAGGTGAAGTTTCAGAAGCAAGCAGATGGAA 460

Qy 1575 CATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGATGAACTTGCACTTGT 1634

I I I I I I I I II I I Ml I I I I I I I I I I I I I I I I 

Db 461 CATGGAAAATTAAACATTTCCAAACACAGAATATTTTCAGTCGTCCGGTATCGCATTGGC 520

Qy 1635 GAAGCCACGAC 1645

III I I I 
Db 521 AAAGTGAAGCC 531



RESULT 8 

US-10-162-012-28 



; Sequence 28, Application US/10162012
; Patent No. 6682597
; GENERAL INFORMATION:
; APPLICANT: Curtis, Rory A.J.
; APPLICANT: Silos-Santiago, Inmaculada
; APPLICANT: Gu, Wei
; TITLE OF INVENTION: NOVEL HUMAN ION CHANNEL AND TRANSPORTER FAMILY MEMBERS
; FILE REFERENCE: 10448-190001
; CURRENT APPLICATION NUMBER: US/10/162,012
; CURRENT FILING DATE: 2002-06-04
; PRIOR APPLICATION NUMBER: US 60/209,845
; PRIOR FILING DATE: 2000-06-06
; PRIOR APPLICATION NUMBER: US 09/875,321
; PRIOR FILING DATE: 2001-06-06
; PRIOR APPLICATION NUMBER: PCT/US01/18340
; PRIOR FILING DATE: 2001-06-06 
; PRIOR APPLICATION NUMBER: US 60/209,257 
; PRIOR FILING DATE: 2000-06-05 
; PRIOR APPLICATION NUMBER: US 09/875,423 
; PRIOR FILING DATE: 2001-06-05 
; PRIOR APPLICATION NUMBER: PCT/US01/18398
; PRIOR FILING DATE: 2001-06-05
; PRIOR APPLICATION NUMBER: US 60/209,238 
; PRIOR FILING DATE: 2000-06-05 
; PRIOR APPLICATION NUMBER: US 09/875,363 
; PRIOR FILING DATE: 2001-06-05 
; PRIOR APPLICATION NUMBER: PCT/US01/18247 
; PRIOR FILING DATE: 2001-06-05 
; PRIOR APPLICATION NUMBER: US 60/227,068 
; PRIOR FILING DATE: 2000-08-22 
; PRIOR APPLICATION NUMBER: US 09/928,530 
; PRIOR FILING DATE: 2001-08-13 
; PRIOR APPLICATION NUMBER: PCT/US01/25475
; PRIOR FILING DATE: 2001-08-15 
; PRIOR APPLICATION NUMBER: US 60/226,770 
; PRIOR FILING DATE: 2000-08-21 
; PRIOR APPLICATION NUMBER: US 09/934,421 
; PRIOR FILING DATE: 2001-08-21 
; PRIOR APPLICATION NUMBER: PCT/US01/26096 
; PRIOR FILING DATE: 2001-08-21 
; PRIOR APPLICATION NUMBER: US 60/279,281 
; PRIOR FILING DATE: 2001-03-28 
; PRIOR APPLICATION NUMBER: US 10/109,029 
; PRIOR FILING DATE: 2002-03-28 
; PRIOR APPLICATION NUMBER: PCT/US02/09728
; PRIOR FILING DATE: 2002-03-28 
; PRIOR APPLICATION NUMBER: US 60/290,288 
; PRIOR FILING DATE: 2001-05-11 
; PRIOR APPLICATION NUMBER: US (not assigned) 
; PRIOR FILING DATE: 2002-05-13 
; NUMBER OF SEQ ID NOS: 48

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 28 

; LENGTH: 2028
; TYPE: DNA
; ORGANISM: Homo sapiens 
US-10-162-012-28 



Query Match 2.2%; Score 38.2; DB 4; Length 2028; 

Best Local Similarity 46.1%; Pred. No. 0.22; 

Matches 239; Conservative 0; Mismatches 273; Indels 7; Gaps 3;



Qy 79 TGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAAGCCATCATAGTTGGTGGC 138

I I I I I I I I I I I I II I I I I I I I I I Ml I 

Db 126 TGGACTATGGTCCACAGTGAAGACCAAAAGAGACACAGTGAAAGGCTACTTCCTGGCTGA 185

Qy 139 CGAGATATTGGTTTATTGGTTGGTGGATTTA-CCATGACAGCTACCTGGGTCGGAGGAGG 197 

III I I II I III I I I I I I I I II I I I I M Ml I II 

Db 186 AGGGAACATGGTGTGGTGGCCAGTGGGTGCATCCTTGTTTGCCAGCAATGTTGGAAGTGG 245

Qy 198 GTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTATGGCCTAGCTTGGGCTCA 257 

MINIMI III I I I I II III I II I I I I I 

Db 246 ACATTTCATTGGCCTGGCAGGGTCAGGTGCTGCTACGGGCATTTCTGTA---TCAGCTTA 302

Qy 258 GGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTGTTCTTTGCAAAACCTAT 317

III I I II I II I II I II I I I I I I II I II M 

Db 303 TGAACTTAATGGCTTGTTTTCTGTGCTGATGTTGGCCTGGATCTTCCTACCCATCTACAT 362 

Qy 318 GCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAAATCTATGGAAAACGCAT 377

I Ml II I I I I III M III I III 

Db 363 TGCTGGTCAGGTCACCACGATGCCAGAATACCTACGGAAGCGCTTCGGTGGCATCAGAAT 422

Qy 378 GGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTCTGGGCTGCAGCAATTTT 437

I I I I I I I I I III I I I I I I II 

Db 423 CCCCATCATCCTGGCTGTACTCTACCTATTTATCTACATCTTCACCAAGATCTCGGTAGA 482

Qy 438 CTCTGCTTTGGGAGCCACCATCAGCGTGATCATCG---ATGTGGATATGCACATTTCTGT 494

I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 483 CATGTATGCAGGTGCCATCTTCATCCAGCAGTCTTCGCACCTGGATCTGTACCTGGCCAT 542

Qy 495 CATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGGCTCTATTCTGTGGC 554

I II I I I I I I I I I I I I I I I I I II 

Db 543 AGTTGGGCTACTGGCCATCACTGCTGTATACACGGTTGCTGGTGGCCTGGCTGCTGTGAT 602 

Qy 555 CTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGG 593 

I I I I I I I I I I I I I I I I Ml I I I I I I 

Db 603 CTACACGGATGCCCTGCAGACGCTGATCATGCTTATAGG 641



RESULT 9 

US-10-162-012-26 

; Sequence 26, Application US/10162012 

; Patent No. 6682597 

; GENERAL INFORMATION: 

; APPLICANT: Curtis, Rory A.J. 

; APPLICANT: Silos-Santiago, Inmaculada 

; APPLICANT: Gu, Wei 

; TITLE OF INVENTION: NOVEL HUMAN ION CHANNEL AND TRANSPORTER FAMILY MEMBERS 

; FILE REFERENCE: 10448-190001 

; CURRENT APPLICATION NUMBER: US/10/162,012

; CURRENT FILING DATE: 2002-06-04 

; PRIOR APPLICATION NUMBER: US 60/209,845 

; PRIOR FILING DATE: 2000-06-06 

; PRIOR APPLICATION NUMBER: US 09/875,321 



; PRIOR FILING DATE: 2001-06-06

; PRIOR APPLICATION NUMBER: PCT/US01/18340

; PRIOR FILING DATE: 2001-06-06 

; PRIOR APPLICATION NUMBER: US 60/209,257 

; PRIOR FILING DATE: 2000-06-05 

; PRIOR APPLICATION NUMBER: US 09/875,423 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: PCT/US01/18398 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: US 60/209,238 

; PRIOR FILING DATE: 2000-06-05 

; PRIOR APPLICATION NUMBER: US 09/875,363 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: PCT/US01/18247 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: US 60/227,068 

; PRIOR FILING DATE: 2000-08-22 

; PRIOR APPLICATION NUMBER: US 09/928,530 

; PRIOR FILING DATE: 2001-08-13 

; PRIOR APPLICATION NUMBER: PCT/US01/25475 

; PRIOR FILING DATE: 2001-08-15 

; PRIOR APPLICATION NUMBER: US 60/226,770 

; PRIOR FILING DATE: 2000-08-21 

; PRIOR APPLICATION NUMBER: US 09/934,421 

; PRIOR FILING DATE: 2001-08-21 

; PRIOR APPLICATION NUMBER: PCT/US01/26096 

; PRIOR FILING DATE: 2001-08-21 

; PRIOR APPLICATION NUMBER: US 60/279,281 

; PRIOR FILING DATE: 2001-03-28 

; PRIOR APPLICATION NUMBER: US 10/109,029 

; PRIOR FILING DATE: 2002-03-28 

; PRIOR APPLICATION NUMBER: PCT/US02/09728 

; PRIOR FILING DATE: 2002-03-28 

; PRIOR APPLICATION NUMBER: US 60/290,288 

; PRIOR FILING DATE: 2001-05-11 

; PRIOR APPLICATION NUMBER: US (not assigned)
; PRIOR FILING DATE: 2002-05-13
; NUMBER OF SEQ ID NOS: 48

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 26 

; LENGTH: 2326
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: CDS
; LOCATION: (178)...(2202)
US-10-162-012-26 



Query Match 2.2%; Score 38.2; DB 4; Length 2326; 

Best Local Similarity 46.1%; Pred. No. 0.24; 

Matches 239; Conservative 0; Mismatches 273; Indels 7; Gaps 3; 

Qy 79 TGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAAGCCATCATAGTTGGTGGC 138

M I I I I M I I I III I I I I I II II I M I 

Db 303 TGGACTATGGTCCACAGTGAAGACCAAAAGAGACACAGTGAAAGGCTACTTCCTGGCTGA 362



Qy 139 CGAGATATTGGTTTATTGGTTGGTGGATTTA-CCATGACAGCTACCTGGGTCGGAGGAGG 197

IN I I I I I III I I I I I I I I II I I I I II I I I I I I

Db 363 AGGGAACATGGTGTGGTGGCCAGTGGGTGCATCCTTGTTTGCCAGCAATGTTGGAAGTGG 422



Qy 198 GTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTATGGCCTAGCTTGGGCTCA 257

I I I I I I II I III I II I II I II I II I MM 

Db 423 ACATTTCATTGGCCTGGCAGGGTCAGGTGCTGCTACGGGCATTTCTGTA---TCAGCTTA 479

Qy 258 GGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTGTTCTTTGCAAAACCTAT 317 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 480 TGAACTTAATGGCTTGTTTTCTGTGCTGATGTTGGCCTGGATCTTCCTACCCATCTACAT 539

Qy 318 GCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAAATCTATGGAAAACGCAT 377

| Ml II I I I I III I I I I I I Ml 

Db 540 TGCTGGTCAGGTCACCACGATGCCAGAATACCTACGGAAGCGCTTCGGTGGCATCAGAAT 599

Qy 378 GGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTCTGGGCTGCAGCAATTTT 437 

I I I I I I I I I III I I I I I I II 

Db 600 CCCCATCATCCTGGCTGTACTCTACCTATTTATCTACATCTTCACCAAGATCTCGGTAGA 659 

Qy 438 CTCTGCTTTGGGAGCCACCATCAGCGTGATCATCG---ATGTGGATATGCACATTTCTGT 494

I I I I I I I I I II I I I I I 

Db 660 CATGTATGCAGGTGCCATCTTCATCCAGCAGTCTTCGCACCTGGATCTGTACCTGGCCAT 719 

Qy 495 CATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGGCTCTATTCTGTGGC 554

| Ml II I MM II I I I I I I I I I I I I 

Db 720 AGTTGGGCTACTGGCCATCACTGCTGTATACACGGTTGCTGGTGGCCTGGCTGCTGTGAT 779

Qy 555 CTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGG 593 

I I I I I I I II I I I I I I I Ml I I I I I I 

Db 780 CTACACGGATGCCCTGCAGACGCTGATCATGCTTATAGG 818



RESULT 10 
US-09-557-884-1/c

; Sequence 1, Application US/09557884 
; Patent No. 6506581 

; GENERAL INFORMATION:
; APPLICANT: Fleischmann et al.
; TITLE OF INVENTION: The Nucleotide sequence of
; the Haemophilus influenzae Rd Genome, Fragments
; Thereof, and Uses Thereof
NUMBER OF SEQUENCES: 1 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Human Genome Sciences, Inc. 

; STREET: 9410 Key West Avenue 

CITY: Rockville 
STATE: MD 
COUNTRY: USA 
; ZIP: 20850 

COMPUTER READABLE FORM: 

MEDIUM TYPE: 3 1/2 inch diskette 
COMPUTER: Dell Pentium 
OPERATING SYSTEM: MS DOS v6.22 
SOFTWARE: ASCII Text 
CURRENT APPLICATION DATA: 

; APPLICATION NUMBER: US/09/557,884
; FILING DATE: 25-Apr-2000 



CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/476,102 
FILING DATE: JUN-5-1995 
ATTORNEY/AGENT INFORMATION: 
NAME: Michelle S. Marks 
REGISTRATION NUMBER: 41,971 
; REFERENCE/DOCKET NUMBER: PB186P3
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 301-309-8504 
TELEFAX: 301-309-8439 
INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1830121 base pairs 
TYPE: nucleic acid 
; STRANDEDNESS: double
TOPOLOGY: linear 
SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
US-09-557-884-1 

Query Match 2.2%; Score 38.2; DB 4; Length 1830121; 

Best Local Similarity 48.7%; Pred. No. 19; 

Matches 132; Conservative 0; Mismatches 138; Indels 1; Gaps 1; 

Qy 1009 CTTGGTGCAGTTTCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGT 1068 

I M I I I I I I I I I I I I I Ml I I 

Db 1428785 CTTTCCGCTATTTTAGCAGCAGTAATGAGTACATTAAGTGCGCAATTGTTAATTTCCTCT 1428726

Qy 1069 TCCATGTTTGCACGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAA 1128

| | || I I I I I I 11 I II I I I I I I I I I I I I I 

Db 1428725 AGCTCAATCACAGAAGATTTCTATAAAGGTTTTATTCGCCCTAACGCATCTGAAAAAGAG 1428666

Qy 1129 ATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGC- 1187

I I I I II I I I I I I I I I I I I I I I Ml Ml I I I I I I I I 

Db 1428665 CTCGTATGGCTTGGCAGAATTATGGTGTTAGTTATTGCCGCACTTGCTATCTGGATCGCA 1428606

Qy 1188 CTTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGT 1247 

I I M I I I I I I I I I I I I I I I I I I I Ml 

Db 1428605 CAAGATGAAAACAGCAAAGTATTAAAACTTGTAGAATTTGCTTGGGCGGGGTTTGGTAGT 1428546

Qy 1248 TATCTTCCCCCAGCTGCTTTGTGTACTCTTT 1278

II III I II II I I I I I I 

Db 1428545 GCATTTGGCCCTGTTGTACTTTTCTCTCTTT 1428515 
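
The `/c` suffix on this result ID appears to mark a hit on the complementary strand, which would explain why the Db coordinates count downward while the Qy coordinates count upward. Either way, each row can be sanity-checked: the number of residues printed (gap characters excluded) should equal the distance between the two coordinates plus one. The helper names below are illustrative, and `-` as the gap character is assumed from the rows in this report.

```python
def residues(aligned: str) -> int:
    """Count residues in an aligned row, ignoring gap characters."""
    return sum(1 for ch in aligned if ch != "-")


def span(start: int, end: int) -> int:
    """Residues covered by a coordinate pair, in either direction."""
    return abs(end - start) + 1


def row_is_consistent(start: int, aligned: str, end: int) -> bool:
    """True when the printed residues fill the coordinate range exactly."""
    return residues(aligned) == span(start, end)


# Last Db row of the alignment above: coordinates fall from 1428545 to 1428515.
print(row_is_consistent(1428545, "GCATTTGGCCCTGTTGTACTTTTCTCTCTTT", 1428515))
```

The same check is how a row damaged in transcription (stray spaces, a lost gap run) can be spotted before it is used downstream.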



RESULT 11 

US-09-643-990A-1/c

; Sequence 1, Application US/09643990A 

; Patent No. 6528289 

; GENERAL INFORMATION: 

; APPLICANT: Robert D. Fleischmann 

; Mark D. Adams 

Owen White 



; Hamilton O. Smith

; J. Craig Venter 

; TITLE OF INVENTION: The Nucleotide sequence of 

; the Haemophilus influenzae Rd Genome, Fragments 

; Thereof, and Uses Thereof 

NUMBER OF SEQUENCES: 1 
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Human Genome Sciences, Inc. 

; STREET: 9410 Key West Avenue 

CITY: Rockville, 
STATE: MD 
COUNTRY: USA 
ZIP: 20850 
COMPUTER READABLE FORM: 
; MEDIUM TYPE: 3 1/2 inch diskette 

; COMPUTER: Dell Pentium 

OPERATING SYSTEM: MS DOS V6.22 
SOFTWARE: ASCII Text 
CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/09/643,990A

; FILING DATE: 23-Aug-2000 

CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 08/487,429 

; FILING DATE: 1995-06-07 

; APPLICATION NUMBER: 08/426,787 

FILING DATE: 1995-04-21 
ATTORNEY/AGENT INFORMATION: 
NAME: Kenley K. Hoover 
REGISTRATION NUMBER: 40,302 
; REFERENCE/DOCKET NUMBER: PB186P1C1 

TELECOMMUNICATION INFORMATION: 
TELEPHONE: 301-610-5790 
TELEFAX: 310-309-8439 
INFORMATION FOR SEQ ID NO: 1: 
; SEQUENCE CHARACTERISTICS: 

; LENGTH: 1830121 base pairs 

; TYPE: nucleic acid 

STRANDEDNESS: double 
TOPOLOGY: linear 
SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
US-09-643-990A-1 

Query Match 2.2%; Score 38.2; DB 4; Length 1830121;

Best Local Similarity 48.7%; Pred. No. 19; 

Matches 132; Conservative 0; Mismatches 138; Indels 1; Gaps 1; 

Qy 1009 CTTGGTGCAGTTTCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGT 1068 

III II III I I I I I I I I I II I II III I I 

Db 1428785 CTTTCCGCTATTTTAGCAGCAGTAATGAGTACATTAAGTGCGCAATTGTTAATTTCCTCT 1428726

Qy 1069 TCCATGTTTGCACGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAA 1128

I I II I MM I II 1 II II II II II I II I I 

Db 1428725 AGCTCAATCACAGAAGATTTCTATAAAGGTTTTATTCGCCCTAACGCATCTGAAAAAGAG 1428666



Qy 1129 ATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGC- 1187 

I I I I I I I I I I I II I I I I 

Db 1428665 CTCGTATGGCTTGGCAGAATTATGGTGTTAGTTATTGCCGCACTTGCTATCTGGATCGCA 1428606

Qy 1188 CTTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGT 1247

I I I I I I I I I I I I I I II I I I I I I I IN 

Db 1428605 CAAGATGAAAACAGCAAAGTATTAAAACTTGTAGAATTTGCTTGGGCGGGGTTTGGTAGT 1428546

Qy 1248 TATCTTCCCCCAGCTGCTTTGTGTACTCTTT 1278

I I I I I I I I II II I I I I 
Db 1428545 GCATTTGGCCCTGTTGTACTTTTCTCTCTTT 1428515 



RESULT 12 
US-09-640-198D-1 

; Sequence 1, Application US/09640198D 

; Patent No. 6586411 

; GENERAL INFORMATION: 

; APPLICANT: Russell, Stephen 

; APPLICANT: Kay Whye, Peng 

; TITLE OF INVENTION: System for Monitoring the Location of 
; TITLE OF INVENTION: Transgenes 
; FILE REFERENCE: 07039-295001 

; CURRENT APPLICATION NUMBER: US/09/640,198D

; CURRENT FILING DATE: 2000-08-16 

; PRIOR APPLICATION NUMBER: US 60/149,168 

; PRIOR FILING DATE: 1999-08-17 

; NUMBER OF SEQ ID NOS: 34

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 1 

LENGTH: 1932 
; TYPE: DNA

ORGANISM: Homo Sapiens 
US-09-640-198D-1 

Query Match 2.2%; Score 38; DB 4; Length 1932; 

Best Local Similarity 51.8%; Pred. No. 0.24; 

Matches 86; Conservative 0; Mismatches 80; Indels 0; Gaps 0;

Qy 437 TCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGATATGCACATTTCTGTCA 496

Mill II I I I I I I I MM 

Db 446 TCTACGCACCGGCCCTCATCCTGAACCAAGTGACCGGGCTGGACATCTGGGCGTCGCTCC 505 

Qy 497 TCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGGCTCTATTCTGTGGCCT 556 

| I I I I I I I Ml M I I I I I I II I I I II 

Db 506 TGTCCACCGGAATTATCTGCACCTTCTACACGGCTGTGGGCGGCATGAAGGCTGTGGTCT 565

Qy 557 ACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGGAT 602

II M I I I I I M I I I I II I N I I I I I 

Db 566 GGACTGATGTGTTCCAGGTCGTGGTGATGCTAAGTGGCTTCTGGGT 611 



RESULT 13 
US-09-601-198-165 

; Sequence 165, Application US/09601198 



; Patent No. 6531583
; GENERAL INFORMATION:
; APPLICANT: Cassell, Gail H.
; APPLICANT: Chen, Ellson Y.
; APPLICANT: Glass, Jennifer S.
; APPLICANT: Glass, John I.
; APPLICANT: Heiner, Cheryl R.
; APPLICANT: Lefkowitz, Elliot
; TITLE OF INVENTION: NUCLEIC ACID PROBES AND METHOD FOR DETECTING UREAPLASMA
; TITLE OF INVENTION: UREALYTICUM
; FILE REFERENCE: UAB-13452/22
; CURRENT APPLICATION NUMBER: US/09/601,198
; CURRENT FILING DATE: 2000-12-08
; PRIOR APPLICATION NUMBER: 60/073,189
; PRIOR FILING DATE: 1998-01-30
; NUMBER OF SEQ ID NOS: 181
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 165
; LENGTH: 4344
; TYPE: DNA
; ORGANISM: Ureaplasma urealyticum
US-09-601-198-165 

Query Match 2.1%; Score 37.4; DB 4; Length 4344; 

Best Local Similarity 48.0%; Pred. No. 0.64; 

Matches 107; Conservative 0; Mismatches 116; Indels 0; Gaps 0; 



Qy 1401 TGATGATAATGGTATATATAATCAGAAATTTCCATTTAAAACACTTGCCATGGTTACATC 1460

M Mil 1 1 1 1 1 Mill 1 III

Db 864 TGTTGATCAATCAGTAGATTTTTTAAAAGTAAATATTGAAGCATTAATTAATCATCAACC 923

Qy 1461 ATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTATCTATTTGAAAGTGGAACCTT 1520

I ! MM 1 1 1 1 1 III III III II 1 1 1 Ml

Db 924 ACTTAAAAACACAACATGAAACGATTTTATTAACAAAAATGTTACAGATATTAGTGCTTT 983

Qy 1521 GCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGACACAGTGAAGAAAACATGGA 1580

1 II M 1 II II 1 1 1 1 1 M 1 MM

Db 984 AAGTAACTTATTAGAAATTTTTGAAACTAATGAAATTACAAATAATGAATGAAACCAATT 1043

Qy 1581 TAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGATGAA 1623

1 1 1 1 M 1 III 1 II 1 1 1 II 1 1 MM

Db 1044 AATTACGATTTTAATTAATCATGCACCTATTGATAAAATTGAA 1086





RESULT 14 
US-09-134-218-1/c

; Sequence 1, Application US/09134218A 

; Patent No. 6312926 

; GENERAL INFORMATION: 

; APPLICANT: Shatkin, Aaron J. 

; APPLICANT: Pillutla, Renuka 

; APPLICANT: Reinberg, Danny 

; APPLICANT: Yu, Zheng 

; APPLICANT: Moldanado, Edio. 

; TITLE OF INVENTION: mRNA CAPPING ENZYMES AND USES THEREOF 

; FILE REFERENCE: 601-1-079 ss 

; CURRENT APPLICATION NUMBER: US/09/134,218A 



; CURRENT FILING DATE: 1998-08-14
; NUMBER OF SEQ ID NOS: 19
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 1
; LENGTH: 4160
; TYPE: DNA
; ORGANISM: Mus musculus
US-09-134-218-1 

Query Match 2.1%; Score 36.6; DB 4; Length 4160; 

Best Local Similarity 46.6%; Pred. No. 1.1; 

Matches 117; Conservative 0; Mismatches 134; Indels 0; Gaps 0; 

Qy 1417 TATAATCAGAAATTTCCATTTAAAACACTTGCCATGGTTACATCATTCTTAACCAACATT 1476

I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I 

Db 2323 TAAACTTAAAATCCAACATTTAAAAAACTCAATATGCTTACAGCTTCAGATTGCTAGTTA 2264

Qy 1477 TGCATCTCCTATCTAGCCAAGTATCTATTTGAAAGTGGAACCTTGCCACCTAAATTAGAT 1536

I I I I I I I I I I I I I I I I I I II I I I I I I I 

Db 2263 TGAATCAAATGTAAAGGTATCTATTCACATACAAACAGGCTCTATTTCATTAACTTCAAT 2204

Qy 1537 GTATTTGATGCTGTTGTTGCAAGACACAGTGAAGAAAACATGGATAAGACAATTCTTGTC 1596

I I I I I I I I I I I I I I I I I I I I I Mill 

Db 2203 CTGATTTAACCTTTGGGTATTTCAATCTGTAGACTCCACAGGGTAAGGCTGAATTTATTC 2144 

Qy 1597 AAAAATGAAAATATTAAATTAGATGAACTTGCACTTGTGAAGCCACGACAGAGCATGACC 1656

I I I I I I I I I I I I I I I II I I I I I I III I 

Db 2143 AGGTATAAATAAAATATTTAGGTCCATGATGTACTGTAGTTCCAAGGAAACCAAATGTAC 2084

Qy 1657 CTCAGCTCAAC 1667 

I I I I I 
Db 2083 CAAATATATAC 2073 



RESULT 15 
US-09-801-876B-3 

; Sequence 3, Application US/09801876B 

; Patent No. 6492155 

; GENERAL INFORMATION: 

; APPLICANT: YE, Jane et al 

; TITLE OF INVENTION: ISOLATED HUMAN KINASE PROTEINS, NUCLEIC 

; TITLE OF INVENTION: ACID MOLECULES ENCODING HUMAN KINASE PROTEINS, AND USES 
; TITLE OF INVENTION: THEREOF 
; FILE REFERENCE: CL001160 

; CURRENT APPLICATION NUMBER: US/09/801,876B
; CURRENT FILING DATE: 2001-03-09 
; NUMBER OF SEQ ID NOS: 8 

; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 3 

; LENGTH: 148567
; TYPE: DNA
; ORGANISM: Human
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: (1)...(148567)
; OTHER INFORMATION: n = A,T,C or G
US-09-801-876B-3 



Query Match 2.1%; Score 36.6; DB 4; Length 148567; 

Best Local Similarity 46.1%; Pred. No. 12; 

Matches 123; Conservative 0; Mismatches 144; Indels 0; Gaps 0;



Qy 1416 ATATAATCAGAAATTTCCATTTAAAACACTTGCCATGGTTACATCATTCTTAACCAACAT 1475

I I I I I I I I I Ml IN M II M II I I I I I I I I 

Db 31673 AGATAATCAGTTGTTTTAACTTTTAATTTAAGCAGTAGCAGAATGACTTTTTGGGAACTT 31732

Qy 1476 TTGCATCTCCTATCTAGCCAAGTATCTATTTGAAAGTGGAACCTTGCCACCTAAATTAGA 1535

I I I I II I I I I I I I I IN II I II II 

Db 31733 AGGAATTTGGAAACCTTTTTATTCTATGTATTGAATATCAACTATGTAATTTAGTCTAAG 31792

Qy 1536 TGTATTTGATGCTGTTGTTGCAAGACACAGTGAAGAAAACATGGATAAGACAATTCTTGT 1595

I I I I I I II I I I I I I I I I II I I I I I I I I I 

Db 31793 GTTATATGCTAGAAACATTTCAAAAACGAAAGCAGCAGCAATGACATCAAAAATGCATGT 31852

Qy 1596 CAAAAATGAAAATATTAAATTAGATGAACTTGCACTTGTGAAGCCACGACAGAGCATGAC 1655

Mill II I I I I I I I I Ml M Ml I IN 

Db 31853 CAAAAGCAAATGGTTTTAAATAGAAATACATCATTTTAACAATCTTGAAGTTTAAAAGAT 31912

Qy 1656 CCTCAGCTCAACTTTCACCAATAAAGA 1682 

III II III I II II 

Db 31913 CCTATAAAAATCACAAACCCAGAAGGA 31939



Search completed: March 22, 2004, 15:19:32 
Job time : 161 secs



GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd.

OM nucleic - nucleic search, using sw model

Run on: March 22, 2004, 11:50:14 ; Search time 635 Seconds
(without alignments)
10153.739 Million cell updates/sec

Title: US-10-069-541-5
Perfect score: 1743
Sequence: 1 atggctttccatgtggaagg ctgaagataatttacagtga 1743

Scoring table: IDENTITY_NUC
Gapop 10.0 , Gapext 1.0

Searched: 2438257 seqs, 1849576744 residues

Total number of hits satisfying chosen parameters: 4876514

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database :

Published_Applications_NA:
1: /cgn2_6/ptodata/1/pubpna/US07_PUBCOMB.seq:*
2: /cgn2_6/ptodata/1/pubpna/PCT_NEW_PUB.seq:*
3: /cgn2_6/ptodata/1/pubpna/US06_NEW_PUB.seq:*
4: /cgn2_6/ptodata/1/pubpna/US06_PUBCOMB.seq:*
5: /cgn2_6/ptodata/1/pubpna/US07_NEW_PUB.seq:*
6: /cgn2_6/ptodata/1/pubpna/PCTUS_PUBCOMB.seq:*
7: /cgn2_6/ptodata/1/pubpna/US08_NEW_PUB.seq:*
8: /cgn2_6/ptodata/1/pubpna/US08_PUBCOMB.seq:*
9: /cgn2_6/ptodata/1/pubpna/US09A_PUBCOMB.seq:*
10: /cgn2_6/ptodata/1/pubpna/US09B_PUBCOMB.seq:*
11: /cgn2_6/ptodata/1/pubpna/US09C_PUBCOMB.seq:*
12: /cgn2_6/ptodata/1/pubpna/US09_NEW_PUB.seq:*
13: /cgn2_6/ptodata/1/pubpna/US10A_PUBCOMB.seq:*
14: /cgn2_6/ptodata/1/pubpna/US10B_PUBCOMB.seq:*
15: /cgn2_6/ptodata/1/pubpna/US10C_PUBCOMB.seq:*
16: /cgn2_6/ptodata/1/pubpna/US10_NEW_PUB.seq:*
17: /cgn2_6/ptodata/1/pubpna/US60_NEW_PUB.seq:*
18: /cgn2_6/ptodata/1/pubpna/US60_PUBCOMB.seq:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
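
The throughput figure in this run's header can be cross-checked from the other header fields. The sketch below assumes that one "cell update" is one cell of the dynamic-programming matrix (query length times database residues) and that both strands are searched; the factor of two is an inference from the hit count being exactly twice the number of sequences searched, not something the report states. The function name is illustrative.

```python
def cell_updates_per_sec(query_len: int, db_residues: int,
                         seconds: float, strands: int = 2) -> float:
    """Dynamic-programming cells filled per second, in millions."""
    cells = query_len * db_residues * strands
    return cells / seconds / 1e6


# Header values for this run: 1743-residue query, 1849576744 database
# residues, 635 seconds of search time.
print(f"{cell_updates_per_sec(1743, 1849576744, 635):.3f} Million cell updates/sec")
```

With those inputs the result agrees with the 10153.739 figure printed in the header, which supports the two-strand reading.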

SUMMARIES 



Result            Query
   No.    Score   Match   Length  DB  ID                     Description

     1     1743   100.0     1743  10  US-09-911-077A-1       Sequence 1, Appli
     2     1743   100.0     1813  10  US-09-911-077A-9       Sequence 9, Appli
     3   1394.2    80.0     4904  10  US-09-911-077A-5       Sequence 5, Appli
     4     1375    78.9     1743  10  US-09-911-077A-3       Sequence 3, Appli
     5     1375    78.9     1743  10  US-09-911-077A-23      Sequence 23, Appl
     6    630.8    36.2   119040  10  US-09-911-077A-19      Sequence 19, Appl
     7    630.8    36.2   142299  10  US-09-911-077A-14      Sequence 14, Appl
     8    376.6    21.6     1833  12  US-10-241-784-1        Sequence 1, Appli
     9    363.8    20.9     1985  10  US-09-911-077A-7       Sequence 7, Appli
    10    242.6    13.9     1461   9  US-09-974-300-501      Sequence 501, App
c   11    180.8    10.4   119040  10  US-09-911-077A-19      Sequence 19, Appl
c   12    180.8    10.4   142299  10  US-09-911-077A-14      Sequence 14, Appl
    13      155     8.9      455   9  US-09-864-761-1838     Sequence 1838, Ap
c   14    118.6     6.8      943  15  US-10-027-632-120553   Sequence 120553,
    15       72     4.1       96   9  US-09-864-761-18589    Sequence 18589, A
    16       60     3.4       60  10  US-09-908-975-10249    Sequence 10249, A
    17     53.8     3.1       65  10  US-09-908-975-26842    Sequence 26842, A
    18       41     2.4     1857  15  US-10-428-868-3        Sequence 3, Appli
    19       41     2.4     2839   9  US-09-995-007-1        Sequence 1, Appli
c   20     39.8     2.3      666  15  US-10-027-632-137101   Sequence 137101,
    21     39.6     2.3     2028   9  US-09-733-630-1        Sequence 1, Appli
    22     39.6     2.3     2456   9  US-09-733-630-3        Sequence 3, Appli
c   23     39.4     2.3      578  15  US-10-027-632-192644   Sequence 192644,
    24     38.2     2.2     2028   9  US-09-928-530-3        Sequence 3, Appli
    25     38.2     2.2     2028  14  US-10-162-012-28       Sequence 28, Appl
    26     38.2     2.2     2028  15  US-10-162-102-28       Sequence 28, Appl
    27     38.2     2.2     2326   9  US-09-928-530-1        Sequence 1, Appli
    28     38.2     2.2     2326  14  US-10-162-012-26       Sequence 26, Appl
    29     38.2     2.2     2326  15  US-10-162-102-26       Sequence 26, Appl
c   30     38.2     2.2  1830121  14  US-10-329-960-1        Sequence 1, Appli
c   31     38.2     2.2  1830121  15  US-10-329-670-1        Sequence 1, Appli
    32       38     2.2      650  15  US-10-027-632-190544   Sequence 190544,
    33       38     2.2     1932  15  US-10-428-868-1        Sequence 1, Appli
c   34     37.8     2.2     6306  14  US-10-239-676-129      Sequence 129, App
c   35     37.6     2.2      867  12  US-10-142-426-20       Sequence 20, Appl
c   36     37.6     2.2      867  14  US-10-123-155-20       Sequence 20, Appl
c   37     37.6     2.2      867  14  US-10-146-731-20       Sequence 20, Appl
c   38     37.6     2.2      867  14  US-10-140-472-20       Sequence 20, Appl


c 


39 


37.6 


2. 


,2 


867 


14 


US-10-141-761-20 


Sequence 20, Appl 


c 


40 


37.6 


2. 


,2 


867 


14 


US-10-142-885-20 


Sequence 20, Appl 


c 


41 


37.6 


2. 


,2 


867 


14 


US-10-158-790-20 


Sequence 20, Appl 


c 


42 


37.6 


2, 


,2 


8 67 


15 


US-10-137-871-20 


Sequence 20, Appl 


c 


43 


37.6 


2, 


.2 


867 


15 


US-10-140-923-20 


Sequence 20, Appl 


c 


44 


37.6 


2, 


.2 


867 


15 


US-10-141-756-20 


Sequence 20, Appl 


c 


45 


37.6 


2. 


.2 


867 


15 


US-10-141-759-20 


Sequence 20, Appl 
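
The Query Match column appears to be each hit's score expressed as a percentage of the perfect score (1743, as stated in the report header). That relationship is inferred from the reported numbers, not from the search program's documentation. A minimal Python sketch, with a hypothetical helper name, that reproduces several of the listed values:

```python
# Perfect score for this query, as reported in the search header.
PERFECT_SCORE = 1743


def query_match_percent(score, perfect_score=PERFECT_SCORE):
    """Return score as a percentage of the perfect score, to one decimal.

    Assumption: this is how the Query Match column is derived; the
    report itself does not state the formula.
    """
    return round(100.0 * score / perfect_score, 1)


# Scores taken from summary rows 3, 4, 6, 10 and 15.
for result_no, score in [(3, 1394.2), (4, 1375), (6, 630.8), (10, 242.6), (15, 72)]:
    print(result_no, score, query_match_percent(score))
```

The printed percentages can be compared against the Match column of the corresponding rows.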



ALIGNMENTS 



RESULT 1
US-09-911-077A-1
; Sequence 1, Application US/09911077A
; Publication No. US20030114399A1
; GENERAL INFORMATION:
;  APPLICANT: BLAKELY, RANDY D.
;  APPLICANT: APPARSUNDARAM, SUBRAMANIAM
;  APPLICANT: FERGUSON, SHAWN
;  TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA
;  FILE REFERENCE: VBLT:008US
;  CURRENT APPLICATION NUMBER: US/09/911,077A
;  CURRENT FILING DATE: 2001-07-23
;  NUMBER OF SEQ ID NOS: 27
;  SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 1
;   LENGTH: 1743
;   TYPE: DNA
;   ORGANISM: Homo sapiens
;   FEATURE:
;   NAME/KEY: CDS
;   LOCATION: (1)..(1743)
US-09-911-077A-1

  Query Match 100.0%; Score 1743; DB 10; Length 1743;
  Best Local Similarity 100.0%; Pred. No. 0;
  Matches 1743; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy     1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

Qy    61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

Qy   121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

Qy   181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

Qy   241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300

Qy   301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

Qy   361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

Qy   421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

Qy   481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

Qy   541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600

Qy   601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660

Qy   661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

Qy   721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780

Qy   781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

Qy   841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900

Qy   901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

Qy   961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020

Qy  1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080

Qy  1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140

Qy  1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

Qy  1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260

Qy  1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320

Qy  1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380

Qy  1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

Qy  1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

Qy  1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560

Qy  1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

Qy  1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

Qy  1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

Qy  1741 TGA 1743
         |||
Db  1741 TGA 1743



RESULT 2
US-09-911-077A-9
; Sequence 9, Application US/09911077A
; Publication No. US20030114399A1
; GENERAL INFORMATION:
;  APPLICANT: BLAKELY, RANDY D.
;  APPLICANT: APPARSUNDARAM, SUBRAMANIAM
;  APPLICANT: FERGUSON, SHAWN
;  TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA
;  FILE REFERENCE: VBLT:008US
;  CURRENT APPLICATION NUMBER: US/09/911,077A
;  CURRENT FILING DATE: 2001-07-23
;  NUMBER OF SEQ ID NOS: 27
;  SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 9
;   LENGTH: 1813
;   TYPE: DNA
;   ORGANISM: Homo sapiens
;   FEATURE:
;   NAME/KEY: CDS
;   LOCATION: (19)..(1761)
US-09-911-077A-9

  Query Match 100.0%; Score 1743; DB 10; Length 1813;
  Best Local Similarity 100.0%; Pred. No. 0;
  Matches 1743; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy     1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    19 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 78

Qy    61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    79 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 138

Qy   121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   139 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 198

Qy   181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   199 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 258

Qy   241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   259 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 318

Qy   301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   319 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 378

Qy   361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   379 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 438

Qy   421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   439 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 498

Qy   481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   499 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 558

Qy   541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   559 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 618

Qy   601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   619 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 678

Qy   661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   679 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 738

Qy   721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   739 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 798

Qy   781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   799 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 858

Qy   841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   859 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 918

Qy   901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   919 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 978

Qy   961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   979 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1038

Qy  1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1039 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1098

Qy  1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1099 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1158

Qy  1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1159 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1218

Qy  1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1219 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1278

Qy  1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1279 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1338

Qy  1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1339 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1398

Qy  1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1399 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1458

Qy  1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1459 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1518

Qy  1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1519 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1578

Qy  1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1579 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1638

Qy  1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1639 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1698

Qy  1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1699 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1758

Qy  1741 TGA 1743
         |||
Db  1759 TGA 1761



RESULT 3
US-09-911-077A-5
; Sequence 5, Application US/09911077A
; Publication No. US20030114399A1
; GENERAL INFORMATION:
;  APPLICANT: BLAKELY, RANDY D.
;  APPLICANT: APPARSUNDARAM, SUBRAMANIAM
;  APPLICANT: FERGUSON, SHAWN
;  TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA
;  FILE REFERENCE: VBLT:008US
;  CURRENT APPLICATION NUMBER: US/09/911,077A
;  CURRENT FILING DATE: 2001-07-23
;  NUMBER OF SEQ ID NOS: 27
;  SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 5
;   LENGTH: 4904
;   TYPE: DNA
;   ORGANISM: Rattus norvegicus
;   FEATURE:
;   NAME/KEY: CDS
;   LOCATION: (224)..(1966)
US-09-911-077A-5

  Query Match 80.0%; Score 1394.2; DB 10; Length 4904;
  Best Local Similarity 87.5%; Pred. No. 0;
  Matches 1525; Conservative 0; Mismatches 218; Indels 0; Gaps 0;
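
The figures on these statistics lines can be cross-checked against one another. The sketch below assumes that Best Local Similarity is matches divided by aligned positions, and that the score equals +1 per match minus 0.6 per mismatch. Both are inferences that happen to reproduce the numbers reported for Results 3 and 4; neither is stated in the report, and the function names are hypothetical.

```python
def best_local_similarity(matches, mismatches, indels=0):
    """Percentage of aligned positions that are identical (assumed definition)."""
    aligned = matches + mismatches + indels
    return round(100.0 * matches / aligned, 1)


def identity_score(matches, mismatches, mismatch_penalty=0.6):
    """Score under an assumed +1 match / -0.6 mismatch scheme, no gaps.

    The 0.6 penalty is inferred from the reported scores, not documented.
    """
    return round(matches - mismatch_penalty * mismatches, 1)


# Result 3: Matches 1525, Mismatches 218
print(best_local_similarity(1525, 218), identity_score(1525, 218))
# Result 4: Matches 1513, Mismatches 230
print(best_local_similarity(1513, 230), identity_score(1513, 230))
```

If either assumption were wrong for gapped alignments, the check would fail there; both results used here report zero indels and zero gaps.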

Qy     1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
         ||| |||||||||| ||||||||  |||| || ||| ||||||||||||| || || |||
Db   224 ATGCCTTTCCATGTAGAAGGACTAGTAGCGATTATCCTGTTCTACCTTCTTATATTTCTG 283

Qy    61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
         ||||||||||||||||| |||| |||||||||||| || |  |||||||| |||||||||
Db   284 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGTAATGCAGAAGAACGCAGCGAA 343

Qy   121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
         |||||||||||||| |||||||| |||||||| ||||||||||| |||||||||||||| 
Db   344 GCCATCATAGTTGGGGGCCGAGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 403

Qy   181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240
         |||||||| |||||||| || ||||| || |||||||||||||||||||  ||||||| |
Db   404 ACCTGGGTTGGAGGAGGTTACATCAACGGGACAGCTGAAGCAGTTTATGGGCCAGGTTGT 463

Qy   241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300
         || |||||||||||||||||||| |||||||||||||| |||||||||||||||||||||
Db   464 GGTCTAGCTTGGGCTCAGGCACCCATTGGATATTCTCTGAGTCTGATTTTAGGTGGCCTG 523

Qy   301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360
         || |||||||||||||||||||| ||||| |||||||| ||||||||||||||||| || 
Db   524 TTTTTTGCAAAACCTATGCGTTCCAAGGGATATGTGACTATGTTAGACCCGTTTCAACAG 583

Qy   361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420
         ||||||||||| |||||||| || || ||||| || ||||||||||||||||| ||||||
Db   584 ATCTATGGAAAGCGCATGGGTGGGCTGCTGTTCATCCCTGCACTGATGGGAGAGATGTTC 643

Qy   421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480
         ||||||||||||||||||||||| || || || ||||||||||| ||||| |||||||||
Db   644 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCTACCATCAGCGTAATCATTGATGTGGAT 703

Qy   481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540
          || |||| || |||||  |||| ||||||||||||| ||| || || || |||||||||
Db   704 GTGAACATATCGGTCATTGTCTCCGCACTCATTGCCATTCTTTATACCCTCGTGGGAGGG 763

Qy   541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600
         ||||| |||||||| || |||||||| || ||||| || ||||||||| ||||  |||||
Db   764 CTCTACTCTGTGGCATATACTGATGTTGTACAGCTATTCTGCATTTTTATAGGATTGTGG 823

Qy   601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660
         ||||| ||||| |||||  ||||||||||||||||| | ||||| || ||||||||||||
Db   824 ATCAGTGTCCCATTTGCCCTGTCACATCCTGCAGTCACCGACATTGGATTCACTGCTGTG 883

Qy   661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720
         ||||| |||||||| |  || |||||||||||  |||| |||  |||||||||| | |||
Db   884 CATGCTAAATACCAGAGTCCCTGGCTGGGAACCATTGAATCAGTTGAAGTCTACACCTGG 943

Qy   721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780
         ||||||| ||||||||||||||||||||||||||| ||||||||||| ||||| ||||||
Db   944 CTTGATAATTTTCTGTTGTTGATGCTGGGTGGAATACCATGGCAAGCCTACTTCCAGAGG 1003

Qy   781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840
         || |||||||| || ||||| ||||||||||| ||||||||||||||||||||||| |||
Db  1004 GTCCTCTCTTCATCGTCAGCGACCTATGCTCAGGTGCTGTCCTTCCTGGCAGCTTTTGGG 1063

Qy   841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
         ||||||||||||||  | ||||||||   |||||||||||||||||| || |||||||||
Db  1064 TGCCTGGTGATGGCTCTACCAGCCATTTGCATTGGGGCCATTGGAGCCTCCACAGACTGG 1123

Qy   901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960
         ||||| |||||||||||| |||||||||||||||| |  || || |||||||||||| | 
Db  1124 AACCAAACTGCATATGGGTTTCCAGATCCCAAGACCAAGGAGGAAGCAGACATGATTCTC 1183

Qy   961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020
         || |||||||| ||||| |||||||||||||| ||||| |||||||| |||||||| |||
Db  1184 CCGATTGTTCTACAGTACCTCTGCCCTGTGTACATTTCCTTCTTTGGGCTTGGTGCTGTT 1243

Qy  1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080
         ||||||||||| ||||| || || || || |||||| | |||||||||||||||||||| 
Db  1244 TCTGCTGCTGTCATGTCCTCGGCTGACTCATCCATCCTATCAGCAAGTTCCATGTTTGCT 1303

Qy  1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140
         ||||| ||||||||||||||||||||||||||||| || ||||| ||||| || ||||| 
Db  1304 CGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAAATTGTGTGGGTC 1363

Qy  1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200
         ||| | ||||| ||||||||||||||||||||||||||||||||||||||||| ||||| 
Db  1364 ATGAGGATCACTGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTCACGAAG 1423

Qy  1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
         ||||||||||||||||||||||| || ||||||||||| |||||| | |||||||| |||
Db  1424 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1483

Qy  1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320
         ||||| |||||||||||  | || ||||||||||| |||||||| || || |||||| ||
Db  1484 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1543

Qy  1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380
         | ||| || ||||||||||| || ||||| ||||||||||| ||  | |||||||| |||
Db  1544 TTTGGACTTTTCCTGAGAATTACCGGAGGAGAGCCATATCTATACTTGCAGCCCTTAATC 1603

Qy  1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440
         ||||||||||| |||||||||||  | ||||||||||| |||||||  || |||||||||
Db  1604 TTCTACCCTGGTTATTACCCTGACAAGAATGGTATATACAATCAGAGGTTCCCATTTAAA 1663

Qy  1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500
         || ||  |||||||||| |||||||| |||||||||||  | ||||||||||||||||||
Db  1664 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCCTATCTAGCCAAGTAT 1723

Qy  1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560
         ||||||||||||||||||||||| || ||||||||| ||||||||||||||||  |||| 
Db  1724 CTATTTGAAAGTGGAACCTTGCCTCCAAAATTAGATATATTTGATGCTGTTGTCTCAAGG 1783

Qy  1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620
         ||||||||||| |||||||| ||||| ||||| |||| ||||||||| || |||||| ||
Db  1784 CACAGTGAAGAGAACATGGACAAGACCATTCTAGTCAGAAATGAAAACATCAAATTAAAT 1843

Qy  1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680
         |||||||||| ||| ||||| ||||||||| | |||||||| ||||||||||||||||||
Db  1844 GAACTTGCACCTGTAAAGCCTCGACAGAGCCTAACCCTCAGTTCAACTTTCACCAATAAA 1903

Qy  1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740
         |||||  ||||||||||||||||||||||||| || ||||||||||||||||| ||||| 
Db  1904 GAGGCTCTCCTTGATGTTGATTCCAGTCCAGAGGGATCTGGGACTGAAGATAACTTACAA 1963

Qy  1741 TGA 1743
         |||
Db  1964 TGA 1966
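
For an ungapped nucleotide alignment scored on identity, the line between each Qy/Db pair marks identical positions with `|` and leaves mismatches blank. A minimal sketch of regenerating such a line from two equal-length segments follows. The function name is hypothetical, and the example uses the first 20 bases of the Result 3 alignment (query positions 1-20, database positions 224-243).

```python
def match_line(query, subject):
    """Return '|' where the two segments agree and ' ' where they differ.

    Assumes an ungapped alignment: both segments must have equal length.
    Comparison is case-sensitive.
    """
    if len(query) != len(subject):
        raise ValueError("segments must have the same length")
    return "".join("|" if q == s else " " for q, s in zip(query, subject))


qy = "ATGGCTTTCCATGTGGAAGG"  # query, positions 1-20
db = "ATGCCTTTCCATGTAGAAGG"  # Result 3 database sequence, positions 224-243
line = match_line(qy, db)
print(qy)
print(line)
print(db)
print("mismatches:", line.count(" "))
```

Summing the blank positions over every block of an alignment gives a count that can be compared with the reported Mismatches figure.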



RESULT 4
US-09-911-077A-3
; Sequence 3, Application US/09911077A
; Publication No. US20030114399A1
; GENERAL INFORMATION:
;  APPLICANT: BLAKELY, RANDY D.
;  APPLICANT: APPARSUNDARAM, SUBRAMANIAM
;  APPLICANT: FERGUSON, SHAWN
;  TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA
;  FILE REFERENCE: VBLT:008US
;  CURRENT APPLICATION NUMBER: US/09/911,077A
;  CURRENT FILING DATE: 2001-07-23
;  NUMBER OF SEQ ID NOS: 27
;  SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 3
;   LENGTH: 1743
;   TYPE: DNA
;   ORGANISM: Mus musculus
;   FEATURE:
;   NAME/KEY: CDS
;   LOCATION: (1)..(1743)
US-09-911-077A-3

  Query Match 78.9%; Score 1375; DB 10; Length 1743;
  Best Local Similarity 86.8%; Pred. No. 0;
  Matches 1513; Conservative 0; Mismatches 230; Indels 0; Gaps 0;
Qy 1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60 

Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I III I MINIM II M II III 

Db 1 ATGCCTTTCCATGTGGAAGGACTGGTAGCTATTATCCTCTTCTACCTCCTTATATTTCTG 60 

Qy 61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 12 0 

II I II M II II I I I I I I Mil II II II II M M MM I II M M I I I I I II III 

Db 61 GTT GGAATAT GGGCT GCATGGAAAACCAAAAACAGCGGCAACCCAGAAGAGCGCAGT GAA 12 0 

Qy 121 GCCAT CATAGTT GGT GGCCGAGATAT T GGTTTATT GGTT GGT GGATTT ACCAT GACAGCT 18 0 

II II I M II II II Mill II M II II I I II I M M I I M M II II M I M M I 

Db 121 GCCATCATAGTCGGGGGCCGTGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 18 0 

Qy 181 ACCT GGGT C GGAGGAGGGT AT AT CAAT GGCACAGCT GAAGC AGTTT AT GTACCAGGTT AT 240 

I M II I M I I M I M I M I I II M II Mill I II M M I I II I M I M M I 

Db 181 ACCT GGGT T GGAGGAGGCT ACAT CAAT GGGAC AGCAGAAGC AGT GT AT GGGC CAGGT T GT 240 

Qy 241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300 

I I I M M II I M I II I Mill II M I M I II I II I M M I M I M II II II Ml 

Db 241 GGTCTAGCTTGGGCTCATGCACCCATTGGATATTCTCTGAGTCTAATTTTAGGTGGTCTG 300 

Qy 301 TTCTTTGCAAAACCTATGCGTTC7WVGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360 

M II M I II II I II II M I M M I M II II M M II M M II I II M Ml 

Db 301 TTTTTTGC GAAAC CT AT G C GT T C C AAG GGAT AT GT GAC TAT GT T AGAC C CAT T C AAAC AG 360 

Qy 361 AT CT AT GGAAAAC GC AT GGGC GGACT C CT GT T TAT T C CT GCACT GAT GGG AG AAAT GT T C 420 

II I 11 II M M II M M M M M M M M M II II II M M I M II I II I M 

Db 361 AT CT AT GGAAAGC GCAT GGGT GGGCT G CT CT T CAT C CCT GCACT GAT G GGAGAGAT GTT C 420 

Qy 421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480 

M M M M II II M II II II II I II II M M II M I I M M II M II II M I M II 

Db 421 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCCACCATCAGCGTGATCATTGATGTGGAT 480 

Qy 481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

I I II M M II M I II I M I! II II II II II I I II II II M M M I I II 

Db 481 GTGAACATATCGGTCATTGTCTCTGCACTCATTGCCATTCTTTATACCCTAGTGGGTGGG 540

Qy 541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600 

(MM I M I M M M M M M II M Mill II I M II I I M MM M I I II 

Db 541 CTCTACTCTGTGGCATATACTGATGTTGTCCAGCTATTCTGCATTTTTATAGGACTGTGG 600 

Qy 601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660 

Mill II I II I II II M I M M M M II M II I M I M II I Mill M M M 

Db 601 ATCAGTGTCCCTTTTGCCCTGTCACATCCTGCAGTCACCGACATCGGATTCACAGCTGTG 660 

Qy 661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720 

M II I M M M M I II M M M M I M M M M I M M M M M I I M 

Db 661 CATGCTAAATACCAGAGTCCCTGGCTGGGAACCATTGAATCAGTTGAAGTCTACACCTGG 720



Qy 721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780 

I I I I I I I MINIMI II I II I I I I I Ml II I I M I II I I I M M Mill II II I I 

Db 721 CTTGATAATTTTCTGTTATTGATGCTGGGTGGAATCCCATGGCAAGCCTACTTCCAGAGG 780



Qy 781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840 

I I M I I I I I I I II M I I I M M M M I I M II I I I M I I M I I M I I I II M II I 

Db 781 GTCCTCTCTTCATCCTCAGCCACCTATGCTCAGGTACTGTCCTTCCTGGCAGCTTTTGGG 840

Qy 841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900

II II II M I I I II I I II I I I I I I II I M II M I I II II M I I I I I II II 

Db 841 TGCCTGGTGATGGCTCTACCCGCCATATGCATAGGAGCTATTGGAGCTTCCACAGACTGG 900

Qy 901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

I I II II II II I II I II I M II I I II II I I I I II I I II II I II II I I II I I 

Db 901 AACCAGACTGCCTACGGGTATCCAGATCCCAAGACTAAGGAGGAAGCAGACATGATTCTC 960

Qy 961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020 

I I I I II I M I M M I I I M M M M M II II II I I I I I I II II II M II III 

Db 961 CCGATCGTTCTGCAGTACCTCTGCCCTGTGTACATCTCCTTCTTTGGGCTTGGTGCTGTT 1020 

Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080 

II II I I II I I I II II II I II II II I II I I I MM II II II I I I II I II I 

Db 1021 TCAGCTGCTGTCATGTCCTCAGCTGACTCGTCCATCCTGTCGGCGAGTTCTATGTTTGCT 1080 

Qy 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140

Mill I M I I M I I I II I II I II I II M I I I I I I M Mill II II I II Mill 

Db 1081 CGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAAATTGTGTGGGTC 1140

Qy 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200 

Ml I Mill III M II M I II M II I II II II II II II II II II II II II II I 

Db 1141 ATGAGGATCACTGTGCTTGTGTTCGGAGCATCTGCAACAGCCATGGCTTTGCTGACGAAG 1200

Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260 

II II II I II II I I II II II II II II II II I II II II II II M I II II I II I III 

Db 1201 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1260 

Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320 

M I M II M II II II I I II II II I I II M I II II II II II II I II I II II 

Db 1261 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1320 

Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380

I Ml II II I II II II II M M II II I II I II M I II II I II I M II I III 

Db 1321 TTTGGACTATTCCTGAGAATTACTGGAGGAGAGCCATATCTATACTTGCAGCCCTTAATC 1380 

Qy 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

II II I I II M I II II II MM I I I II II II II I II II II I II II II I I I II 

Db 1381 TTCTACCCTGGTTATTACTCTGACAAGAATGGTATATACAATCAGAGGTTCCCATTTAAA 1440

Qy 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500 

II II II II II II II II II I I I I II II II II M I I II II II II II II II I II 

Db 1441 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCTTATCTAGCCAAGTAT 1500

Qy 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560 

II I II I I II M II II M II II II M II II II II II II II I II II II II I II Mill 

Db 1501 CTATTTGAAAGTGGAACCTTGCCTCCAAAATTAGATGTATTTGATGCTGTTGTCGCAAGG 1560

Qy 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620



I I I I I I II I I I I II I I II I II I I I I I I II III! I I I I I I I I I I I I I I I I I I M 

Db 1561 CACAGTGAAGAGAACATGGACAAGACCATTCTAGTCAGAAATGAAAATATCAAATTAAAT 1620

Qy 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

I II I I I II I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 1621 GAACTTGCACCTGTGAAACCTCGGCAGAGCCTAACCCTCAGTTCAACTTTCACCAATAAG 1680

Qy 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1681 GAGGCCCTCCTTGATGTTGATTCCAGTCCGGAGGGGTCTGGGACTGAAGATAATTTACAA 1740 

Qy 1741 TGA 1743 

I I I 

Db 1741 TGA 1743 



RESULT 5 

US-09-911-077A-23 

; Sequence 23, Application US/09911077A
; Publication No. US20030114399A1
; GENERAL INFORMATION:
; APPLICANT: BLAKELY, RANDY D.
; APPLICANT: APPARSUNDARAM, SUBRAMANIAM
; APPLICANT: FERGUSON, SHAWN
; TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA
; FILE REFERENCE: VBLT:008US
; CURRENT APPLICATION NUMBER: US/09/911,077A
; CURRENT FILING DATE: 2001-07-23
; NUMBER OF SEQ ID NOS: 27
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 23
; LENGTH: 1743
; TYPE: DNA
; ORGANISM: Mus musculus
; FEATURE:
; NAME/KEY: CDS
; LOCATION: (1)..(1743)
US-09-911-077A-23 

Query Match 78.9%; Score 1375; DB 10; Length 1743; 

Best Local Similarity 86.8%; Pred. No. 0; 

Matches 1513; Conservative 0; Mismatches 230; Indels 0; Gaps 0; 



Qy 1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60

III I I I I I I I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I I II II II Ml 

Db 1 ATGCCTTTCCATGTGGAAGGACTGGTAGCTATTATCCTCTTCTACCTCCTTATATTTCTG 60 

Qy 61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I II II I I II II II I III 
Db 61 GTTGGAATATGGGCTGCATGGAAAACCAAAAACAGCGGCAACCCAGAAGAGCGCAGTGAA 120 

Qy 121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180

I I I I I I I I I I I II INN II I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I 

Db 121 GCCATCATAGTCGGGGGCCGTGACATTGGTTTGTTGGTTGGTGGTTTTACCATGACAGCC 180 



Qy 181 ACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTAT 240

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 



Db 181 ACCTGGGTTGGAGGAGGCTACATCAATGGGACAGCAGAAGCAGTGTATGGGCCAGGTTGT 240

Qy 241 GGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTG 300

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 

Db 241 GGTCTAGCTTGGGCTCATGCACCCATTGGATATTCTCTGAGTCTAATTTTAGGTGGTCTG 300

Qy 301 TTCTTTGCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAA 360

| | Mill I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I I II I II 
Db 301 TTTTTTGCGAAACCTATGCGTTCCAAGGGATATGTGACTATGTTAGACCCATTCAAACAG 360

Qy 361 ATCTATGGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTC 420

| | | | | I I M II I I I I I I I I M M M M M I I I I I I I I I I I M I I I I I I I I I I 
Db 361 ATCTATGGAAAGCGCATGGGTGGGCTGCTCTTCATCCCTGCACTGATGGGAGAGATGTTC 420

Qy 421 TGGGCTGCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGAT 480

| | I I I I I I I I I I I I I I I I I I I I I II M I I I I I I I I I I I i I I I I I I I I I I I I I I M I 

Db 421 TGGGCTGCAGCAATTTTCTCTGCATTAGGGGCCACCATCAGCGTGATCATTGATGTGGAT 480

Qy 481 ATGCACATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGG 540

|| | | | | || | | I I I I I I I I I I I I I I I M I I I I I II II M M I I I I I IN 

Db 481 GTGAACATATCGGTCATTGTCTCTGCACTCATTGCCATTCTTTATACCCTAGTGGGTGGG 540

Qy 541 CTCTATTCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGG 600

I I I I I I I I I I I I I I I I I I M INN II MINIMI MM II I II I 

Db 541 CTCTACTCTGTGGCATATACTGATGTTGTCCAGCTATTCTGCATTTTTATAGGACTGTGG 600
Qy 601 ATCAGCGTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTG 660

I || II II I I I I II I I II M I M II II I M II I I I II II II I II II I MUM 

Db 601 ATCAGTGTCCCTTTTGCCCTGTCACATCCTGCAGTCACCGACATCGGATTCACAGCTGTG 660

Qy 661 CATGCCAAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGG 720

|| | || || | || || I I II II II I II M I I II I I I I I II M I M II I I II I 
Db 661 CATGCTAAATACCAGAGTCCCTGGCTGGGAACCATTGAATCAGTTGAAGTCTACACCTGG 720

Qy 721 CTTGATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGG 780

| | | M I I I II I I II M II II I I I I M II I I II II I I M II I I I I I Mill II II II 
Db 721 CTTGATAATTTTCTGTTATTGATGCTGGGTGGAATCCCATGGCAAGCCTACTTCCAGAGG 780

Qy 781 GTTCTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGG 840

|| || || I I II I II I II I N II II I II I N I I I I I M IN 

Db 781 GTCCTCTCTTCATCCTCAGCCACCTATGCTCAGGTACTGTCCTTCCTGGCAGCTTTTGGG 840

Qy 841 TGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTGGAGCATCAACAGACTGG 900
| MUM I II II M M II M I II II I M I II II II I I 

Db 841 TGCCTGGTGATGGCTCTACCCGCCATATGCATAGGAGCTATTGGAGCTTCCACAGACTGG 900

Qy 901 AACCAGACTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTA 960

| | M I I II II I II III II I I I II II II I M M I M M I I M M I M I I I I 

Db 901 AACCAGACTGCCTACGGGTATCCAGATCCCAAGACTAAGGAGGAAGCAGACATGATTCTC 960

Qy 961 CCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTT 1020

II || II Ill IMMMM I MINIM I II I II I I IN 

Db 961 CCGATCGTTCTGCAGTACCTCTGCCCTGTGTACATCTCCTTCTTTGGGCTTGGTGCTGTT 1020
Qy 1021 TCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCA 1080

M M I M II I I I I II II II I II M I 1 I I I I NN II I II I I MINIM 

Db 1021 TCAGCTGCTGTCATGTCCTCAGCTGACTCGTCCATCCTGTCGGCGAGTTCTATGTTTGCT 1080






Qy 1081 CGGAACATCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTT 1140

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Mill I I I I I II I I I M 
Db 1081 CGGAATATCTACCAGCTTTCCTTCAGACAAAATGCATCAGACAAGGAAATTGTGTGGGTC 1140

Qy 1141 ATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAA 1200

III I I I I II III I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1141 ATGAGGATCACTGTGCTTGTGTTCGGAGCATCTGCAACAGCCATGGCTTTGCTGACGAAG 1200

Qy 1201 ACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAG 1260
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I MINIM Ml 

Db 1201 ACTGTGTATGGGCTCTGGTACCTGAGCTCTGACCTTGTCTACATCATCATCTTCCCACAG 1260
Qy 1261 CTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTT 1320

Mill II I M M II II I M M I II I II I M M II M II M II I M I I I II 

Db 1261 CTGCTCTGTGTACTCTTCATCAAAGGAACCAACACTTATGGGGCAGTTGCTGGTTATATT 1320

Qy 1321 TCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATC 1380

I III M I I II II I II II II I II M I II I I II II II I II 111111111 Ml 

Db 1321 TTTGGACTATTCCTGAGAATTACTGGAGGAGAGCCATATCTATACTTGCAGCCCTTAATC 1380

Qy 1381 TTCTACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAA 1440

I II II II II II II II I I MM I II II I I I II II M M II I M II I II I I II 

Db 1381 TTCTACCCTGGTTATTACTCTGACAAGAATGGTATATACAATCAGAGGTTCCCATTTAAA 1440
Qy 1441 ACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTAT 1500

I I I I M I I I I M I I M M M M I I M I II II M I I I M M II II M M I I I 

Db 1441 ACTCTCTCCATGGTTACCTCATTCTTTACCAACATTTGTGTTTCTTATCTAGCCAAGTAT 1500

Qy 1501 CTATTTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGTTGCAAGA 1560

II II I II I I I M I M II I II II I M M II I I I M M I M M I M I I II I I I Mill 

Db 1501 CTATTTGAAAGTGGAACCTTGCCTCCAAAATTAGATGTATTTGATGCTGTTGTCGCAAGG 1560
Qy 1561 CACAGTGAAGAAAACATGGATAAGACAATTCTTGTCAAAAATGAAAATATTAAATTAGAT 1620

II II I II I I I I M M II II Mill Mill MM II II II M II II I II I II M 

Db 1561 CACAGTGAAGAGAACATGGACAAGACCATTCTAGTCAGAAATGAAAATATCAAATTAAAT 1620

Qy 1621 GAACTTGCACTTGTGAAGCCACGACAGAGCATGACCCTCAGCTCAACTTTCACCAATAAA 1680

II II I II II I II M II II II II II I I I M I I II II I II I II II I II I I M M 

Db 1621 GAACTTGCACCTGTGAAACCTCGGCAGAGCCTAACCCTCAGTTCAACTTTCACCAATAAG 1680

Qy 1681 GAGGCCTTCCTTGATGTTGATTCCAGTCCAGAAGGGTCTGGGACTGAAGATAATTTACAG 1740

I II I M I I I II II I II II II II II II I I M M M II M M I II II II I I II I II M 

Db 1681 GAGGCCCTCCTTGATGTTGATTCCAGTCCGGAGGGGTCTGGGACTGAAGATAATTTACAA 1740

Qy 1741 TGA 1743
I I I 

Db 1741 TGA 1743



RESULT 6 

US-09-911-077A-19 

; Sequence 19, Application US/09911077A
; Publication No. US20030114399A1
; GENERAL INFORMATION:
; APPLICANT: BLAKELY, RANDY D.
; APPLICANT: APPARSUNDARAM, SUBRAMANIAM
; APPLICANT: FERGUSON, SHAWN
; TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA
; FILE REFERENCE: VBLT:008US
; CURRENT APPLICATION NUMBER: US/09/911,077A
; CURRENT FILING DATE: 2001-07-23
; NUMBER OF SEQ ID NOS: 27
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 19
; LENGTH: 119040
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: modified_base
; LOCATION: (2347)..(90873)
; OTHER INFORMATION: N = A, C, G or T/U
US-09-911-077A-19 



Query Match 36.2%; Score 630.8; DB 10; Length 119040;

Best Local Similarity 99.7%; Pred. No. 4.9e-178;

Matches 632; Conservative 0; Mismatches 2; Indels 0; Gaps 0;



Qy  1110 AAATGCTTCGGACAAAGAAATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGC 1169
         | | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 30755 ACAGGCTTCGGACAAAGAAATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGC 30814

Qy  1170 ATCTGCAACAGCCATGGCCTTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTC 1229
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 30815 ATCTGCAACAGCCATGGCCTTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTC 30874

Qy  1230 TGACCTTGTTTACATCGTTATCTTCCCCCAGCTGCTTTGTGTACTCTTTGTTAAGGGAAC 1289
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 30875 TGACCTTGTTTACATCGTTATCTTCCCCCAGCTGCTTTGTGTACTCTTTGTTAAGGGAAC 30934

Qy  1290 CAACACCTATGGGGCCGTGGCAGGTTATGTTTCTGGCCTCTTCCTGAGAATAACTGGAGG 1349
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 30935 CAACACCTATGGGGCCGTGGCAGGTTATGTTTCTGGCCTCTTCCTGAGAATAACTGGAGG 30994

Qy  1350 GGAGCCATATCTGTATCTTCAGCCCTTGATCTTCTACCCTGGCTATTACCCTGATGATAA 1409
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 30995 GGAGCCATATCTGTATCTTCAGCCCTTGATCTTCTACCCTGGCTATTACCCTGATGATAA 31054

Qy  1410 TGGTATATATAATCAGAAATTTCCATTTAAAACACTTGCCATGGTTACATCATTCTTAAC 1469
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 31055 TGGTATATATAATCAGAAATTTCCATTTAAAACACTTGCCATGGTTACATCATTCTTAAC 31114

Qy  1470 CAACATTTGCATCTCCTATCTAGCCAAGTATCTATTTGAAAGTGGAACCTTGCCACCTAA 1529
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 31115 CAACATTTGCATCTCCTATCTAGCCAAGTATCTATTTGAAAGTGGAACCTTGCCACCTAA 31174

Qy  1530 ATTAGATGTATTTGATGCTGTTGTTGCAAGACACAGTGAAGAAAACATGGATAAGACAAT 1589
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 31175 ATTAGATGTATTTGATGCTGTTGTTGCAAGACACAGTGAAGAAAACATGGATAAGACAAT 31234

Qy  1590 TCTTGTCAAAAATGAAAATATTAAATTAGATGAACTTGCACTTGTGAAGCCACGACAGAG 1649
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 31235 TCTTGTCAAAAATGAAAATATTAAATTAGATGAACTTGCACTTGTGAAGCCACGACAGAG 31294

Qy  1650 CATGACCCTCAGCTCAACTTTCACCAATAAAGAGGCCTTCCTTGATGTTGATTCCAGTCC 1709
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 31295 CATGACCCTCAGCTCAACTTTCACCAATAAAGAGGCCTTCCTTGATGTTGATTCCAGTCC 31354

Qy  1710 AGAAGGGTCTGGGACTGAAGATAATTTACAGTGA 1743
         ||||||||||||||||||||||||||||||||||
Db 31355 AGAAGGGTCTGGGACTGAAGATAATTTACAGTGA 31388



RESULT 7 

US-09-911-077A-14 

; Sequence 14, Application US/09911077A
; Publication No. US20030114399A1
; GENERAL INFORMATION:
; APPLICANT: BLAKELY, RANDY D.
; APPLICANT: APPARSUNDARAM, SUBRAMANIAM
; APPLICANT: FERGUSON, SHAWN
; TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA
; FILE REFERENCE: VBLT:008US
; CURRENT APPLICATION NUMBER: US/09/911,077A
; CURRENT FILING DATE: 2001-07-23
; NUMBER OF SEQ ID NOS: 27
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 14
; LENGTH: 142299
; TYPE: DNA
; ORGANISM: Artificial Sequence
; FEATURE:
; OTHER INFORMATION: Description of Artificial Sequence: Synthetic
; OTHER INFORMATION: Primer
; FEATURE:
; NAME/KEY: modified_base
; LOCATION: (1305)..(127835)
; OTHER INFORMATION: N = A, C, G or T/U
US-09-911-077A-14 

Query Match 36.2%; Score 630.8; DB 10; Length 142299; 

Best Local Similarity 99.7%; Pred. No. 5.6e-178; 

Matches 632; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy  1110 AAATGCTTCGGACAAAGAAATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGC 1169
         | | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 94673 ACAGGCTTCGGACAAAGAAATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGC 94732

Qy  1170 ATCTGCAACAGCCATGGCCTTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTC 1229
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 94733 ATCTGCAACAGCCATGGCCTTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTC 94792

Qy  1230 TGACCTTGTTTACATCGTTATCTTCCCCCAGCTGCTTTGTGTACTCTTTGTTAAGGGAAC 1289
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 94793 TGACCTTGTTTACATCGTTATCTTCCCCCAGCTGCTTTGTGTACTCTTTGTTAAGGGAAC 94852

Qy  1290 CAACACCTATGGGGCCGTGGCAGGTTATGTTTCTGGCCTCTTCCTGAGAATAACTGGAGG 1349
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 94853 CAACACCTATGGGGCCGTGGCAGGTTATGTTTCTGGCCTCTTCCTGAGAATAACTGGAGG 94912

Qy  1350 GGAGCCATATCTGTATCTTCAGCCCTTGATCTTCTACCCTGGCTATTACCCTGATGATAA 1409
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 94913 GGAGCCATATCTGTATCTTCAGCCCTTGATCTTCTACCCTGGCTATTACCCTGATGATAA 94972

Qy  1410 TGGTATATATAATCAGAAATTTCCATTTAAAACACTTGCCATGGTTACATCATTCTTAAC 1469
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 94973 TGGTATATATAATCAGAAATTTCCATTTAAAACACTTGCCATGGTTACATCATTCTTAAC 95032

Qy  1470 CAACATTTGCATCTCCTATCTAGCCAAGTATCTATTTGAAAGTGGAACCTTGCCACCTAA 1529
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 95033 CAACATTTGCATCTCCTATCTAGCCAAGTATCTATTTGAAAGTGGAACCTTGCCACCTAA 95092

Qy  1530 ATTAGATGTATTTGATGCTGTTGTTGCAAGACACAGTGAAGAAAACATGGATAAGACAAT 1589
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 95093 ATTAGATGTATTTGATGCTGTTGTTGCAAGACACAGTGAAGAAAACATGGATAAGACAAT 95152

Qy  1590 TCTTGTCAAAAATGAAAATATTAAATTAGATGAACTTGCACTTGTGAAGCCACGACAGAG 1649
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 95153 TCTTGTCAAAAATGAAAATATTAAATTAGATGAACTTGCACTTGTGAAGCCACGACAGAG 95212

Qy  1650 CATGACCCTCAGCTCAACTTTCACCAATAAAGAGGCCTTCCTTGATGTTGATTCCAGTCC 1709
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 95213 CATGACCCTCAGCTCAACTTTCACCAATAAAGAGGCCTTCCTTGATGTTGATTCCAGTCC 95272

Qy  1710 AGAAGGGTCTGGGACTGAAGATAATTTACAGTGA 1743
         ||||||||||||||||||||||||||||||||||
Db 95273



RESULT 8 
US-10-241-784-1 

; Sequence 1, Application US/10241784
; Publication No. US20040048261A1
; GENERAL INFORMATION:
; APPLICANT: Bayer Corporation
; TITLE OF INVENTION: Invertebrate Choline Transporter Nucleic Acid, Polypeptides and Uses
; TITLE OF INVENTION: Thereof
; FILE REFERENCE: M07218
; CURRENT APPLICATION NUMBER: US/10/241,784
; CURRENT FILING DATE: 2002-09-11
; NUMBER OF SEQ ID NOS: 2
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 1
; LENGTH: 1833
; TYPE: DNA
; ORGANISM: Drosophila melanogaster
; FEATURE:
; NAME/KEY: CDS
; LOCATION: (1)..(1833)
; OTHER INFORMATION:
US-10-241-784-1 

Query Match 21.6%; Score 376.6; DB 12; Length 1833; 

Best Local Similarity 56.0%; Pred. No. 2e-102; 

Matches 868; Conservative 0; Mismatches 624; Indels 57; Gaps 6; 



Qy 8 TCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTGGTTGGAA 67

Db 5 TCAATATCGCTGGCGTGGTGAGCATCGTGCTCTTCTACCTCCTGATCCTGGTCGTTGGCA 64

68 TAT GGGCTGCCTG G AGAAC C AAAAAC AGT G G C AG C G C AG AAG AG C G C AG C GAAG C CAT C A 127 

I I I I I I I III I M I M Ml I IN 

65 TTTGGGCCGGTCGCAAGAAGCAGTCCGGCAATGATTCGGAGGAG GAGGTCA 115 

128 TAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCTACCTGGG 187 

I II MM I I I I I M I I M III IMIIMIM 

116 TGCTGGCCGGACGCTCCATCGGCCTCTTCGTGGGCATCTTCACCATGACGGCCACCTGGG 17 5 

188 TCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTATGGCCTAG 247 

I I I I I 1 I M I I I I I I I II I M I I II Ml I I I I I Ml 

176 TGGGTGGCGGCTACATCAACGGCACGGCGGAGGCTATATACACATCGGGT CTGG 229 

248 CTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTGTTCTTTG 307 

Ml I M I I I I I I I I I M I I I M I II I I I I I I I I I I I 

230 TGTGGTGCCAGGGTCCATTTGGATACGCTCTAAGCTTGGTATTTGGTGGCATCTTCTTTG 289 

308 C AAAAC CT AT G C GT T C AAAG G G GT AT GT GAC CAT GT T AGAC C C GT T T C AG C AAAT CT AT G 367 

I | | M II M I I I II I I I II I I I II I I I I I I I I I I I I M I I 

290 C CAAT C C CAT GC GCAAGCAGGGT T AC AT CAC CAT GT T G GAT CC GTT GCAGGATT C CT T T G 349 

368 GAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTCTGGGCTG 427 

I I II I II II I M M I I Mill! 

350 GTGAGCGGATGGGAGGATTGCTCTTCCTGCCCGCTCTATGCGGTGAGGTCTTTTGGGCAG 409 

428 CAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGATATGCACA 487 

II || I I I II I II M II I I II I I I I I I II I Mill III 

410 CCGGCATCCTGGCTGCACTTGGCGCCACTCTATCGGTGATCATCGACATGGATCACCGCA 469 

488 TTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGGCTCTATT 547 

M I I II I I I I I I I I I II I I I II II II I I I I I M M II I 

470 CCTCGGTGATCCTGTCCTCCTGCATCGCCATCTTCTACACACTGTTCGGTGGACTGTACT 529 

54 8 CTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGGATCAGCG 607 

I I I I I I .11 II I I I I I II I I II M M I M I II I I II M M M 

530 CCGTGGCGTATACGGACGTGATCCAGTTGTTCTGCATCTTCATCGGTCTGTGGATGTGCA 589 

608 TCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTGCATGCCA 667 

I I I I M I I I I I M I III I I II I II 

590 TTCCCTTCGCCTGGAGCAACGAGCACGTGGGCAGCCTGAGTGACCTGGAGGTGGAT 64 5 

668 AAT AC C AAAAGC CGTGGCTGG GAACT GT T GACT C AT CT GAAGT CT ACT CT T GG CT T GAT A 727 

II I I I I II II I I I I IM 

64 6 TGGATTGGGCACGTGGAGCCTAAAAAGCATTGGCTGTACATAGACT 691 

728 GTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGGGTTCTCT 787 

I II II I M M I I M M M II II II M II II II I 

692 ACGGCTTGCTGCTCGTCTTTGGTGGCATTCCCTGGCAGGTCTACTTCCAGCGGCAAAAC- 750 

78 8 CTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGGTGCCTGG 847 

I I I M II I III II II II II II II I 

75i GGCAGGAAGGGCCCAGCTTCTGCCTATGTTGCAGCCGCCGGATGCATTT 799 

84 8 T GAT GGC C AT C C CAGCCAT ACT CAT T GGGGCC ATT G GAGCAT CAAC AGACT GGAAC C AGA 907 
M II II II II II I II I II I I I I II II I M I I I I II II I I I 



Db 800 Trz\Trr^rr ATTrrrrr^GTfirTCATrGGAGCGATTGCCAAGGCTACACCTTGGAACGAGA 859

Qy 908 CTGCATATGGGCTTCCAGATCCCAAGACTACAGAAGAGGCAGACATGATTTTACCAATTG 967
III 1 I 1 1 1 1 II III 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1
Db 860 r A G ATT A r A A GG G AC C CT AT C CC CT GAC C GT GGACGAGAC GAGCAT GATT CT GC C C AT GG 919

Qy 968 TTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTTTCTGCTG 1027
I I M 1 ! 1 1 1 III MM 1 1 M 1 M M M 1 1 M M M M 1 M 1
Db 920 fprrTrr APT ArrTrArGrrTGACTTCGTGTCCTTrTTTGGATTGGGCGCTGTTTCCGCCG 979

Qy 1028 CTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGTTCCATGTTTGCACGGAACA 1087
I M M M 1 M M M M M 1 1 M M 1 M M 1 M M M M M
Db 980 rrrTr ATrTrrTrrGrrGArTrrTrGGTGCTCTCCGCCGCCTCCATGTTCGCTCGGAACG 1039

Qy 1088 TCTACCAGCTTTCCTTCAGACAAAATGCTTCGGACAAAGAAATCGTTTGGGTTATGCGAA 1147
I M 1 1 1 M 1 1 M M 1 1 M M 1 M M M M M M 1 1 M 1 M
Db 1040 Tr m Qr a a arrTrATTTTrrnTr AGAAGGrGTCCGAGATGGAAATCATTTGGGTGATGCGAG 1099

Qy 1148 TCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCCTTGCTGACGAAAACTGTGT 1207
Ml 1 M M 1 1 M M M 1 1 M 1 M 1 1 1 II
Db 1100 TrrrraTrafPTrTrrTrrrtraTrrT^rTArfATfATGGCCCTCACCATTCCCTCCATCT 1159

Qy 1208 ATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAGCTGCTT- 1266
III | I M I 1 1 I 1 1 II 1 1 M 1 1 1 1 1 M 1 1 1 II 1 1 II 1 1 II
Db 1160 arrrTTTrTrrTrrDTrTrrTr^ATrTnGTrTArGTrATTCTGTTCCCGCAGCTACTGA 1219

Qy 1267 --TGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTTTCTG 1324
II I II 1 III 1 II II 1 1 1 M 1 1 1 1 1 II M 1
Db 1220 mrrfrrrTrrarTTraarziiirrarTGrAArArGTArGGrAGrrTGTCGGCATACATTGTGG 1279

Qy 1325 GCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATCTTCT 1384
| | | M III 1 1 1 M 1 1 1 1 II II M M 1 1 II 1
Db 1280 rrrrvrrrr ATrrr ArTrTrGGGrGGTGAGGnCATCTTGGGACTGGCTCCATTGATCAAGT 1339

Qy 1385 ACCCTGGCTATTACCCTGATGATAATGGTATATATAATCAGAAATTTCCATTTAAAACAC 1444
II M II III IMM 1 1 M M II
Db 1340 ATrrrrrrTAPGArGAGGAGACCAAGG AGCAGATGTTCCCCTTCCGCACCA 1390

Qy 1445 TTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTATCTAGCCAAGTATCTAT 1504
1 1 1 I 1 1 1 1 1 III 1 1 1 1 MINI III M
Db 1391 TGGCCATGCTGCTCAGCCTGGTCACGCTCATCTCGGTCTCCTGGTGGACTAAAATGATGT 1450

Qy 1505 TTGAAAGTGGAACCTTGCCACCTAAATTAGATGTATTTGATGCTGTTGT 1553
1 II 1 III 1 1 1 1 1 1 1 1 1 M II Mill
Db 1451 TTGAGTCCGGCAAGTTGCCGCCCAGCTACGACTACTTCCGCTGTGTGGT 1499





RESULT 9 

US-09-911-077A-7 

; Sequence 7, Application US/09911077A
; Publication No. US20030114399A1
; GENERAL INFORMATION:
; APPLICANT: BLAKELY, RANDY D.
; APPLICANT: APPARSUNDARAM, SUBRAMANIAM
; APPLICANT: FERGUSON, SHAWN
; TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA
; FILE REFERENCE: VBLT:008US
; CURRENT APPLICATION NUMBER: US/09/911,077A
; CURRENT FILING DATE: 2001-07-23
; NUMBER OF SEQ ID NOS: 27
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 7
; LENGTH: 1985
; TYPE: DNA
; ORGANISM: Caenorhabditis elegans
; FEATURE:
; NAME/KEY: CDS
; LOCATION: (9)..(1739)
US-09-911-077A-7 

Query Match 20.9%; Score 363.8; DB 10; Length 1985; 

Best Local Similarity 55.1%; Pred. No. 1.6e-98; 

Matches 862; Conservative 0; Mismatches 637; Indels 66; Gaps 5; 

Qy 19 GGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTGGTTGGAATATGGGCTGCC 78

|| | | I I I I Ill I II M I I I I I I I I I II I I I I I I I 

Db 24 GGTATCGTGGCCATTGTGTTCTTCTACGTGCTCATTCTTGTCGTTGGAATATGGGCGGGT 83 



Qy 79 T GGAGAAC CAAAA ACAGT G GCAGC GCAGAAGAGC GC AGC GAAGC CAT C 126

I I I Mill INN I 

Db 84 AGAAAAT C GAAAAGTT CAAAAGAGCTT GAAT CAGAAGCCGGCGCGGCGACGGAAGAGGTG 143 



Qy 127 ATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCTACCTGG 186

| | | | | I I II I I M II M II I I Ml 

Db 144 ATGTTAGCTGGGAGAAACATCGGAACTCTTGTCGGAATTTTCACAATGACTGCCACGTGG 203

Qy 187 GTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAGGTTATGGCCTA 246

M I I I I I I I I I I I I I I I I I i I I I I I I I M I I I M II 

Db 204 GTTGGCGGTGCTTATATCAATGGAACCGCCGAGGCTCTGTATAATGGAGGT CTC 257 

Qy 247 GCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGTGGCCTGTTCTTT 306 

II I I II I I II I I I I II I I I I II M I Ml IN 

Db 258 CTTGGATGTCAGGCTCCAGTTGGATATGCAATTTCCCTTGTTATGGGAGGACTACTTTTC 317 

Qy 307 GCAAAACCTATGCGTTCAAAGGGGTATGTGACCATGTTAGACCCGTTTCAGCAAATCTAT 366

I M II I I II I | I I I I I I I I I I I I I I I I I - I I I I I I I I I IN 

Db 318 GCAAAGAAAATGCGAGAAGAAGGATATATTACAATGCTCGATCCTTTTCAGCACAAATAT 377

Qy 367 GGAAAACGCATGGGCGGACTCCTGTTTATTCCTGCACTGATGGGAGAAATGTTCTGGGCT 426 

|| | M I I I Ml MM I II I I I I I I I I I I I 

Db 378 GGCCAACGAATCGGTGGCTTGATGTATGTTCCAGCACTTCTTGGTGAAACATTCTGGACA 437

Qy 427 GCAGCAATTTTCTCTGCTTTGGGAGCCACCATCAGCGTGATCATCGATGTGGATATGCAC 486

|| || I M I I I I II I I I I I I I I MM l> I II I II I 

Db 438 GCAGCCATTCTTTCGGCACTTGGTGCAACACTGTCGGTAATTCTTGGAATCGACATGAAT 497 

Qy 487 ATTTCTGTCATCATCTCTGCACTCATTGCCACTCTGTACACACTGGTGGGAGGGCTCTAT 546

| | M I I I M II II II II I II II M I MM MM 

Db 498 GCATCAGTGACCCTGTCGGCCTGTATTGCCGTATTCTACACATTCACCGGTGGATACTAT 557

Qy 547 TCTGTGGCCTACACTGATGTCGTTCAGCTCTTTTGCATTTTTGTAGGGCTGTGGATCAGC 606 

| | | | | M I II I II I M M I I I I I M I II II I I II M M I I I M I M 

Db 558 GCAGTCGCGTACACTGACGTCGTTCAACTATTTTGCATTTTCGTCGGTTTGTGGGTTTGC 617 






607 GTCCCCTTTGCATTGTCACATCCTGCAGTCGCAGACATCGGGTTCACTGCTGTGCATGCC 666 

I I I I MM Ml M I MM I M I I 

618 GT G C C G GC GGCT AT G GT G CAT GAT G GT G C GAAG GAT AT T T C CAGGAAT G C AG 669 

667 AAATACCAAAAGCCGTGGCTGGGAACTGTTGACTCATCTGAAGTCTACTCTTGGCTTGAT 72 6 

I I I I M I I Ml I II I II II I 

670 GCGACT GGATT GGAGAGATTGGAGGATT CAAAGAAACAT CT CT CT GGATT GAT 722 

727 AGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGGGTTCTC 786 

I IM I I I I II M I II I II I M M II I I I I I I I II I M II I 

723 TGCATGCTTCTCCTTGTCTTTGGAGGAATTCCATGGCAAGTGTACTTCCAAAGAGTTCTC 782 

787 TCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGGTGCCTG 846

I I I I III I I I I I II II II I I II I I I II I II I 

783 TCCTCAAAAACTGCTCATGGAGCACAGACGTTGTCGTTTGTGGCGGGCGTCGGATGCATT 842 

847 GT GAT G GC CAT C C C AGC CAT ACT CAT T GG GGC CAT T G GAGC AT C AAC AGACT GG AAC C AG 906 

I I M II II II I I I M II II M II M II II II M I 

84 3 CT CAT GGCGATT C CAC CAGC GT T GAT CGGT G CAAT T GC CAGGAACACAGACT G GAGAAT G 902 

907 ACTGCATATGGGCTTCC AGAT C C CAAGACTACAGAAGAGGCA 948 

II M I I I I Mil I I II II I 

903 AC T GAT TAT T C C C CAT G G AAC AAT G G AAC T AAG GT C G AAT C GAT T C CAC C G GAT AAG AG A 962 

949 GACATGATTTTACCAATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGT 1008 

II II I I III III I I I I I II M II I I III MM 

963 AACATGGTGGTCCCGTTGGTATTCCAGTATCTTACGCCAAGATGGGTCGCCTTTATTGGA 1022 

1009 CTTGGTGCAGTTTCTGCTGCTGTTATGTCATCAGCAGATTCTTCCATCTTGTCAGCAAGT 1068 

M M Mill M I II M M I M II I II I M M M M M I I M M M 

1023 CTCGGCGCAGTGTCGGCTGCTGTAATGTCATCTGCAGATTCATCTGTACTATCAGCAGCA 1082 

1069 T C CAT GT T T GCAC GG AAC AT CT AC CAGCT T T C CT T C AGAC AAAAT G CT T C G GACAAAGAA 1128 

II I I II I I M I I II I II I II I I I I II I I I II II II II II 

1083 T CAAT GT TT GCT CAC AAC AT C T GGAAGC T C ACAAT T C GC C C T CAC GC GT C T GAAAAAGAA 1142 

1129 ATCGTTTGGGTTATGCGAATCACAGTGTTTGTGTTTGGAGCATCTGCAACAGCCATGGCC 1188 

i I I I I I I I I I I I I I II I I Mill II M II 

1143 GT GAT AATT GT GAT GAGAAT AGCCAT CAT CT GT GT T GGTAT CAT GGCAAC CAT CAT GGCA 1202 

1189 TTGCTGACGAAAACTGTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTT 1248 

I I Ml I II M M M M M I II II I M I M Ml M I 

1203 CTTACCATTCAATCCATCTATGGGCTTTGGTATCTTTGTGCAGATTTGGTCTACGTCATA 1262 

124 9 ATCTTCCCCCAGCTGCTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTG 1308 

II I I II I I I II I II II I I I I I I I II I II I I I II I M 

12 63 CT CTT CC CT C AACT AT TAT GT GTT GTAT AT AT GC CACGT AGCAAT AC GT AT GGCT C ATT G 1322 

1309 GCAGGTTATGTTTCTGGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTT 1368 

II I I I M I II II I II I I I II II II II I M I III II 

1323 GCTGGCTATGCAGTCGGTCTTGTGCTCCGTTTGATTGGAGGCGAGCCACTTGTATCGCTG 1382 

1369 CAGCC CTT GAT CTTCTACCCT GGCT ATTAC C CT GAT GAT AAT GGTAT AT AT AAT CAGAAA 1428 

I I II I I I I I I III I I I II I I 

1383 C CAGC GTT CTT C CATT AT CCAAT GT ATACGGAT G GGG TACAGTAT 1427 

1429 TTTCCATTTAAAACACTTGCCATGGTTACATCATTCTTAACCAACATTTGCATCTCCTAT 1488 



I I 1 1 1 1 1 I III 1 1 1 1 1 1 I I 1 1 1 I 1 1 I I III 

Db 1428 T T C C CAT T C AG GACAACT GCT AT GT TAT CT T CAAT GGCT ACT AT CT AC ATT GT AT CAAT A 1487 

Qy 148 9 CTAGCCAAGTATCTATTT GAAAGT GGAACCT T GCCACCTAAATTAGAT GTATTTGATGCT 1548 

III I I I I I II II III II I I I I II I I I I I I I I I 

D b 14 88 CAATCGGAGAAGCTGTTCAAATCGGGACGTTTGTCTCCGGAGTGGGACGTAATGGGTTGT 1547 

Qy 1549 GTTGT 1553 

I I I I 

Db 1548 GTAGT 1552 

RESULT 10 
US-09-974-300-501 

Sequence 501, Application US/09974300 
Patent No. US20020146721A1 
GENERAL INFORMATION: 
APPLICANT: Berka, Randy M. 
APPLICANT: Clausen, Ib Groth 

TITLE OF INVENTION: Methods For Monitoring Multiple Gene 
TITLE OF INVENTION: Expression 
FILE REFERENCE: 10085.500-US 
CURRENT APPLICATION NUMBER: US/09/974,300 
CURRENT FILING DATE: 2001-10-05 
PRIOR APPLICATION NUMBER: 09/680,598 
PRIOR FILING DATE: 2000-10-06 
PRIOR APPLICATION NUMBER: 60/279,526 
PRIOR FILING DATE: 2001-03-27 
NUMBER OF SEQ ID NOS : 8481 

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 501 
LENGTH: 1461 
TYPE: DNA 

ORGANISM: Bacillus licheniformis 
US-09-974-300-501 

Query Match 13.9%; Score 242.6; DB 9; Length 1461; 

Best Local Similarity 52.4%; Pred. No. 5.9e-62; 

Matches 663; Conservative 0; Mismatches 554; Indels 48; Gaps 4; 

TT ATT GGTT GGTGGATTTACCAT GACAGCTACCT GGGT CGGAGGAGGGT AT AT CAATGGC 210 

I I I I M II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

TTTTTCGTCGGAATGGTGACGATGGCCGCAACATGGGTCGGCGGCGGATATATTAACGGA 184 

ACAGCTGAAGCAGTTTATGTACCAGGTTATGGCCTAGCTTGGGCTCAGGCACCAATTGGA 270 

| | I I I I I I I I I I I Mill HIM M II I I I 

ACGGCCGAATCGACTTACA GCGACGGCCTCATCTGGGCCCAAGCGCCTTGGGGC 238 

TATTCTCTTAGTCTGATTTTAGGTGGCCTGTTCTTTGCAAAACCTATGCGTTCAAAGGGG 330 

M I I I I I I I I I I I M I I I M I I I I I I I Mill I 

TACGCATTGAGCCTGATTATCGGCGGTATTTTCTTCGCCAGAAAAATGCGCCGTCATCAA 2 98 

TATGTGACCATGTTAGACCCGTTTCAGCAAATCTATGGAAAACGCATGGGCGGACTCCTG 390 

I I M I I I I I I II I I I III II M I I II M I I I I I I I 



Qy 


151 


Db 


125 


Qy 


211 


Db 


185 


Qy 


271 


Db 


239 


Qy 


331 


Db 


299 



Qy 



391 TTTATTCCTGCACTGATGGGAGAAATGTTCTGGGCTGCAGCAATTTTCTCTGCTTTGGGA 450 



359 TATATACCGGCGCTGTTAGGAGAATTGTTTTGGAGCGCCGCGATCTTAACGGCATTGGGC 418 

451 GCCACCAT CAGCGT GAT CAT CGAT GT GGAT AT GCACATTT CT GT CAT CAT CT CTGCACTC 510 

| | I III I I I I I I I I I I I I I I I II IN I I I I I I 
419 AC GACT T T C GGAAT GAT T CT GAAT AT C GATT T C C AAAC GT CGAT TAT T CT T T C GGC GAT G 478 

511 ATT GCCACT CT GT ACAC ACT GGT GGGAGGGCT CT AT T CT GT GGC CT ACACT GAT GTC GTT 57 0 

| | | | | | | | I I II I Mill II I II II I I I I II I M I I I 

479 ATCGCCATCGCTTATACGGTGGCCGGAGGCATGTGGGCAGTTGCTTTCACAGATGTCTTT 538 

571 CAGCTCTTTTGCATTTTTGTAGGGCTGTGGATCAGCGTCCCCTTTGCATTGTCACATCCT 630 

| | | || MINI I I I I I I I I I I I I I II I I I I I I I I I I M 

539 CAAATGATTGTCATTTTGCTCGGGCTGTTTTTAGTCGTCCCATTTGTATTGTCGAATGTC 598 

631 GCAG T CGCAGACAT CGGGTT CACT GCT GT GCAT GCCAA 668 

|| I I I I I I I I I I I I I I 

599 GGTGCTCTCGATAGCGTATGGGCAAATTACAGGCACGATTTCGGCAGCAGCGCCAATCTG 65 8 

669 AT ACCAAAAGC C GT GGCT GGGAACT GTT GACT CAT CT GAAGT CT ACT CT TGGCTT 72 3 

I | | II I I I I I I I I I I I Ml 

659 CTTCCGCCGTTGGACGGCTGGAAAAACCCGGATTGGGGAAACCTGTTTTGGAACTGGTGG 718 

724 GATAGTTTTCTGTTGTTGATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGGGTT 783 

| | M | || I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

719 GATAATGCGCTCCTCTTAATTTTCGGAGGTATCGCATGGCAGGTGTACTTTCAGCGCGTT 77 8 

784 CTCTCTTCTTCCTCAGCCACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGGTGC 84 3 

II I I I I I I I M I I Ml 

77 9 CTTTCGGCAAAATCGGAAAGCGCCGCCATGTGGCAGTCGATAATTGCCGGAGTGATCTGC 838 

844 CT GGT GAT GGCCAT CCCAGC CAT ACT CATT GGGGCCAT T GGAGCATCAACAGACT GGAAC 903 

| | I I I I I I I I II Nil II II I I I I I I I I I I I I 

839 ATCATTGCCGCCATTCCGTGCGTAATCATCGGAGCTGCCGGAAACAGTACCGATTGGAGC 898 

904 C AG ACT GCAT AT G GGCT T C C AGAT C C CAAGACT ACAGAAGAGGCAGAC AT GAT TTT AC C A 963 

|| || I III III I I I Ml I 

899 CTGTTCGGAGCGAGCGCTCCGGATAACCCGGCG ATGATTTTGCCG 943 

964 ATTGTTCTGCAGTATCTCTGCCCTGTGTATATTTCTTTCTTTGGTCTTGGTGCAGTTTCT 1023 

II I I I I I III II I I I I I I I I I I I I I I 

944 CAAACGCTTGCGTATTTGACGCCAGGAATCATCGCAGGCCTCGGCTTGGGTGCAATCGCA 1003 

1024 GCT GCT GT TAT GT CAT CAGC AGAT T CT T C CAT CT T GT C AG C AAGT T C CAT GT TT GCAC GG 1083 

M I M II I I I I I I I I I I I I I I M I 

1004 GCAG CC GT CAT GT C AAGC AT G GAC T CAT C GATT CT AT C GGCAT CAT CAAT G GCC GCAT GG 1063 

1084 AAC AT CT AC C AGCTT T C CT T C AGACAAAAT GCT T C GGACAAAGAAAT C GT T T GGGT TAT G 1143 

| | | I I I I I I I I I I I I I I I I I I I I I I II I 

1064 AAT AT T T AC C GT C C GCT CAT CAAG C C GAAGGC C AC C C AAAAAC AGCT GCAAAAAGT C GT C 1123 

1144 C GAAT C AC AGT GT T T GT GT TT GGAGC AT CT GCAAC AGC C AT GGCCTTGCT GAC GAAAACT 1203 

I | I I I I I I I I I I I I I I M I I Ml M M I I I I 

1124 AAACGCTCAATCATTTTGTTCGGCGCGGGAGCAGCGGTCATCGCGCTGAATGTCAAAAGC 1183 

1204 GTGTATGGGCTCTGGTACCTCAGTTCTGACCTTGTTTACATCGTTATCTTCCCCCAGCTG 1263 
Mill I I I I I I I I I I I I I I I I I I IMM II 



Db 



1184 GTTTATACTTTATGGTATTTGGCTTCGGATTTAGTTTATTGCATTCTTTTTCCCCAGTTA 1243 



Qy 1264 CTTTGTGTACTCTTTGTTAAGGGAACCAACACCTATGGGGCCGTGGCAGGTTATGTTTCT 1323 

I I I I I I I III I I I I I I I I I I I I II II I II I 

Db 1244 ACAATGGCCCTCTTTTATAAAAGAGCAAATCTTTACGGGTCGATTGCTGGATTTGCAGTT 1303 

Qy 1324 GGCCTCTTCCTGAGAATAACTGGAGGGGAGCCATATCTGTATCTTCAGCCCTTGATCTTC 1383 

I I I I I I I I I I II I I I I I I I III III I I 

Db 1304 GCAGTCATTCTGAGGCTCGGCGGTGGTGAACCCGCATTCGGCATTCCGCCGCTTCTGCCG 1363 

Qy 1384 TACCC 1388 

I I I I 

Db 1364 TATCC 1368 



RESULT 11 

US-09-911-077A-19/c 

Sequence 19, Application US/09911077A 
Publication No. US20030114399A1 
GENERAL INFORMATION: 
APPLICANT: BLAKELY, RANDY D. 
APPLICANT: APPARSUNDARAM, SUBRAMANIAM 
APPLICANT: FERGUSON, SHAWN 

TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA 
FILE REFERENCE: VBLT:008US 

CURRENT APPLICATION NUMBER: US/09/911,077A 
CURRENT FILING DATE: 2001-07-23 
NUMBER OF SEQ ID NOS : 27 
SOFTWARE: PatentIn Ver. 2.1 
SEQ ID NO 19 

LENGTH: 119040 
TYPE: DNA 

ORGANISM: Homo sapiens 
FEATURE: 
NAME/KEY: modified_base 
LOCATION: (2347)..(90873) 
OTHER INFORMATION: N = A, C, G or T/U 
US-09-911-077A-19 

Query Match 10.4%; Score 180.8; DB 10; Length 119040; 

Best Local Similarity 98.9%; Pred. No. 6.9e-42; 

Matches 182; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy     1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 94584 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 94525

Qy    61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 94524 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 94465

Qy   121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db 94464 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGGT 94405

Qy   181 ACCT 184
         || |
Db 94404 ACGT 94401



RESULT 12 

US-09-911-077A-14/c 

Sequence 14, Application US/09911077A 
Publication No. US20030114399A1 
GENERAL INFORMATION: 
APPLICANT: BLAKELY, RANDY D. 
APPLICANT: APPARSUNDARAM, SUBRAMANIAM 
APPLICANT: FERGUSON, SHAWN 

TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA 
FILE REFERENCE: VBLT:008US 

CURRENT APPLICATION NUMBER: US/09/911,077A 
CURRENT FILING DATE: 2001-07-23 
NUMBER OF SEQ ID NOS : 27 
SOFTWARE: PatentIn Ver. 2.1 
SEQ ID NO 14 

LENGTH: 142299 
TYPE: DNA 

ORGANISM: Artificial Sequence 
FEATURE : 

OTHER INFORMATION: Description of Artificial Sequence: Synthetic 
OTHER INFORMATION: Primer 
FEATURE: 
NAME/KEY: modified_base 
LOCATION: (1305)..(127835) 
OTHER INFORMATION: N = A, C, G or T/U 
US-09-911-077A-14 

Query Match 10.4%; Score 180.8; DB 10; Length 142299; 

Best Local Similarity 98.9%; Pred. No. 7.8e-42; 

Matches 182; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy     1 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 64222 ATGGCTTTCCATGTGGAAGGACTGATAGCTATCATCGTGTTCTACCTTCTAATTTTGCTG 64163

Qy    61 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 64162 GTTGGAATATGGGCTGCCTGGAGAACCAAAAACAGTGGCAGCGCAGAAGAGCGCAGCGAA 64103

Qy   121 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGCT 180
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db 64102 GCCATCATAGTTGGTGGCCGAGATATTGGTTTATTGGTTGGTGGATTTACCATGACAGGT 64043

Qy   181 ACCT 184
         || |
Db 64042 ACGT 64039



RESULT 13 

US-09-864-761-1838 

; Sequence 1838, Application US/09864761 

; Patent No. US20020048763A1 

; GENERAL INFORMATION: 

; APPLICANT: Penn, Sharron G. 



; APPLICANT: Rank, David R. 

; APPLICANT: Hanzel, David K. 

; APPLICANT: Chen, Wensheng 

; TITLE OF INVENTION: HUMAN GENOME-DERIVED SINGLE EXON NUCLEIC ACID PROBES USEFUL FOR 

; TITLE OF INVENTION: GENE EXPRESSION ANALYSIS BY MICROARRAY 

; FILE REFERENCE: Aeomica-X-1 

; CURRENT APPLICATION NUMBER: US/09/864,761 

; CURRENT FILING DATE: 2001-05-23 

; PRIOR APPLICATION NUMBER: US 60/180,312 

; PRIOR FILING DATE: 2000-02-04 

; PRIOR APPLICATION NUMBER: US 60/207,456 

; PRIOR FILING DATE: 2000-05-26 

; PRIOR APPLICATION NUMBER: US 09/632,366 

; PRIOR FILING DATE: 2000-08-03 

; PRIOR APPLICATION NUMBER: GB 24263.6 

; PRIOR FILING DATE: 2000-10-04 

; PRIOR APPLICATION NUMBER: US 60/236,359 

; PRIOR FILING DATE: 2000-09-27 

; PRIOR APPLICATION NUMBER: PCT/US01/00666 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00667 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00664 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00669 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00665 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00668 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00663 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00662 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00661 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00670 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: US 60/234,687 

; PRIOR FILING DATE: 2000-09-21 

; PRIOR APPLICATION NUMBER: US 09/608,408 

; PRIOR FILING DATE: 2000-06-30 

; PRIOR APPLICATION NUMBER: US 09/774,203 

; PRIOR FILING DATE: 2001-01-29 

; NUMBER OF SEQ ID NOS: 49117 

; SOFTWARE: Annomax Sequence Listing Engine vers. 1.1 

; SEQ ID NO 1838 

; LENGTH: 455 
; TYPE: DNA 
; ORGANISM: Homo sapiens 
; FEATURE: 
; OTHER INFORMATION: MAP TO AC009963.2 
; OTHER INFORMATION: EXPRESSED IN FETAL LIVER, SIGNAL =1.2 
; OTHER INFORMATION: EXPRESSED IN ADULT LIVER, SIGNAL =1.1 
; OTHER INFORMATION: EXPRESSED IN HELA, SIGNAL =1.2 
; OTHER INFORMATION: EXPRESSED IN PLACENTA, SIGNAL =1.3 
; OTHER INFORMATION: EXPRESSED IN HBL100, SIGNAL =0.97 
; OTHER INFORMATION: EXPRESSED IN HEART, SIGNAL =1.2 
; OTHER INFORMATION: EXPRESSED IN BRAIN, SIGNAL =1.2 
; OTHER INFORMATION: EXPRESSED IN BONE MARROW, SIGNAL =1.3 
US-09-864-761-1838 



Query Match 8.9%; Score 155; DB 9; Length 455; 

Best Local Similarity 100.0%; Pred. No. 8.1e-36; 
Matches 155; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy   741 GATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGGGTTCTCTCTTCTTCCTCAGC 800
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   266 GATGCTGGGTGGAATCCCATGGCAAGCATACTTTCAGAGGGTTCTCTCTTCTTCCTCAGC 325

Qy   801 CACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGGTGCCTGGTGATGGCCATCCC 860
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   326 CACCTATGCTCAAGTGCTGTCCTTCCTGGCAGCTTTCGGGTGCCTGGTGATGGCCATCCC 385

Qy   861 AGCCATACTCATTGGGGCCATTGGAGCATCAACAG 895
         |||||||||||||||||||||||||||||||||||
Db   386 AGCCATACTCATTGGGGCCATTGGAGCATCAACAG 420



RESULT 14 

US-10-027-632-120553/c 

; Sequence 120553, Application US/10027632 

; Publication No. US20030204075A9 

; GENERAL INFORMATION: 

; APPLICANT: Wang, David G. 

; TITLE OF INVENTION: Identification and Mapping of Single Nucleotide 

; TITLE OF INVENTION: Polymorphisms in the Human Genome 

; FILE REFERENCE: 108827.129 

; CURRENT APPLICATION NUMBER: US/10/027,632 

; CURRENT FILING DATE: 2002-04-30 

; PRIOR APPLICATION NUMBER: US 60/218,006 

; PRIOR FILING DATE: 2000-07-12 

; PRIOR APPLICATION NUMBER: US 60/198,676 

; PRIOR FILING DATE: 2000-04-20 

; PRIOR APPLICATION NUMBER: US 60/193,483 

; PRIOR FILING DATE: 2000-03-29 

; PRIOR APPLICATION NUMBER: US 60/185,218 

; PRIOR FILING DATE: 2000-02-24 

; PRIOR APPLICATION NUMBER: US 60/167,363 

; PRIOR FILING DATE: 1999-11-23 

; PRIOR APPLICATION NUMBER: US 60/156,358 

; PRIOR FILING DATE: 1999-09-28 

; PRIOR APPLICATION NUMBER: US 60/146,002 

; PRIOR FILING DATE: 1999-08-09 

; NUMBER OF SEQ ID NOS : 325720 

; SOFTWARE: FastSEQ for Windows Version 4.0 

; SEQ ID NO 120553 

; LENGTH: 943 

; TYPE: DNA 

; ORGANISM: Human 

US-10-027-632-120553 



Query Match 6.8%; Score 118.6; DB 15; Length 943; 



Best Local Similarity 99.2%; Pred. No. 1.4e-24; 

Matches 118; Conservative 1; Mismatches 0; Indels 0; Gaps 0; 



Qy   176 CAGCTACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAG 235
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   589 CAGCTACCTGGGTCGGAGGAGGGTATATCAATGGCACAGCTGAAGCAGTTTATGTACCAG 530

Qy   236 GTTATGGCCTAGCTTGGGCTCAGGCACCAATTGGATATTCTCTTAGTCTGATTTTAGGT 294
         |||||||||||||||||||||||||||||:|||||||||||||||||||||||||||||
Db   529 GTTATGGCCTAGCTTGGGCTCAGGCACCARTTGGATATTCTCTTAGTCTGATTTTAGGT 471



RESULT 15 

US-09-864-761-18589 

; Sequence 18589, Application US/09864761 

; Patent No. US20020048763A1 

; GENERAL INFORMATION: 

; APPLICANT: Penn, Sharron G. 

; APPLICANT: Rank, David R. 

; APPLICANT: Hanzel, David K. 

; APPLICANT: Chen, Wensheng 

; TITLE OF INVENTION: HUMAN GENOME-DERIVED SINGLE EXON NUCLEIC ACID PROBES USEFUL FOR 

; TITLE OF INVENTION: GENE EXPRESSION ANALYSIS BY MICROARRAY 

; FILE REFERENCE: Aeomica-X-1 

; CURRENT APPLICATION NUMBER: US/09/864,761 

; CURRENT FILING DATE: 2001-05-23 

; PRIOR APPLICATION NUMBER: US 60/180,312 

; PRIOR FILING DATE: 2000-02-04 

; PRIOR APPLICATION NUMBER: US 60/207,456 

; PRIOR FILING DATE: 2000-05-26 

; PRIOR APPLICATION NUMBER: US 09/632,366 

; PRIOR FILING DATE: 2000-08-03 

; PRIOR APPLICATION NUMBER: GB 24263.6 

; PRIOR FILING DATE: 2000-10-04 

; PRIOR APPLICATION NUMBER: US 60/236,359 

; PRIOR FILING DATE: 2000-09-27 

; PRIOR APPLICATION NUMBER: PCT/US01/00666 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00667 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00664 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00669 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00665 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00668 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00663 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00662 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00661 

; PRIOR FILING DATE: 2001-01-30 

; PRIOR APPLICATION NUMBER: PCT/US01/00670 

; PRIOR FILING DATE: 2001-01-30 



; PRIOR APPLICATION NUMBER: US 60/234,687 
; PRIOR FILING DATE: 2000-09-21 
; PRIOR APPLICATION NUMBER: US 09/608,408 
; PRIOR FILING DATE: 2000-06-30 
; PRIOR APPLICATION NUMBER: US 09/774,203 
; PRIOR FILING DATE: 2001-01-29 
; NUMBER OF SEQ ID NOS: 49117 
; SOFTWARE: Annomax Sequence Listing Engine vers. 1.1 
; SEQ ID NO 18589 
; LENGTH: 96 
; TYPE: DNA 
; ORGANISM: Homo sapiens 
; FEATURE: 
; OTHER INFORMATION: MAP TO AC009963.2 
; OTHER INFORMATION: EXPRESSED IN FETAL LIVER, SIGNAL =1.2 
; OTHER INFORMATION: EXPRESSED IN ADULT LIVER, SIGNAL =1.1 
; OTHER INFORMATION: EXPRESSED IN HELA, SIGNAL =1.2 
; OTHER INFORMATION: EXPRESSED IN PLACENTA, SIGNAL =1.3 
; OTHER INFORMATION: EXPRESSED IN HBL100, SIGNAL =0.97 
; OTHER INFORMATION: EXPRESSED IN HEART, SIGNAL =1.2 
; OTHER INFORMATION: EXPRESSED IN BRAIN, SIGNAL =1.2 
; OTHER INFORMATION: EXPRESSED IN BONE MARROW, SIGNAL = 1 
; OTHER INFORMATION: NT HIT: gil!141884, EVALUE 5.00e-33 
US-09-864-761-18589 



Query Match 4.1%; Score 72; DB 9; Length 96; 

Best Local Similarity 100.0%; Pred. No. 3.4e-11; 
Matches 72; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy   824 TCCTGGCAGCTTTCGGGTGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTG 883
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1 TCCTGGCAGCTTTCGGGTGCCTGGTGATGGCCATCCCAGCCATACTCATTGGGGCCATTG 60

Qy   884 GAGCATCAACAG 895
         ||||||||||||
Db    61 GAGCATCAACAG 72



Search completed: March 22, 2004, 15:30:20 
Job time : 650 secs 
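
The percentages in the per-result summary lines above can be cross-checked against the counts printed beside them. The sketch below is not part of the search output. It assumes, without confirmation from the report itself, that "Query Match" is the score divided by the perfect score (1743, from the run header) and that "Best Local Similarity" is matches divided by all aligned columns (matches + conservative + mismatches + indels). The names `query_match`, `local_similarity` and `check` are hypothetical helpers introduced only for this illustration; the figures in `REPORTED` are copied from Results 10 to 15.

```python
# Cross-check of the summary percentages reported for Results 10-15.
# Assumptions (not stated in the report):
#   Query Match           = Score / Perfect score
#   Best Local Similarity = Matches / aligned columns

PERFECT_SCORE = 1743  # "Perfect score" from the run header

# (result, score, matches, conservative, mismatches, indels,
#  reported query match %, reported best local similarity %)
REPORTED = [
    (10, 242.6, 663, 0, 554, 48, 13.9, 52.4),
    (11, 180.8, 182, 0, 2, 0, 10.4, 98.9),
    (12, 180.8, 182, 0, 2, 0, 10.4, 98.9),
    (13, 155.0, 155, 0, 0, 0, 8.9, 100.0),
    (14, 118.6, 118, 1, 0, 0, 6.8, 99.2),
    (15, 72.0, 72, 0, 0, 0, 4.1, 100.0),
]


def query_match(score, perfect=PERFECT_SCORE):
    """Score as a percentage of the perfect score (assumed definition)."""
    return 100.0 * score / perfect


def local_similarity(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of aligned columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


def check(rows=REPORTED, tol=0.05):
    """Return the result numbers whose reported percentages are not reproduced."""
    bad = []
    for n, score, m, c, mm, ind, qm, sim in rows:
        ok_qm = abs(query_match(score) - qm) <= tol
        ok_sim = abs(local_similarity(m, c, mm, ind) - sim) <= tol
        if not (ok_qm and ok_sim):
            bad.append(n)
    return bad


if __name__ == "__main__":
    print(check())
```

Under these assumed definitions every reported percentage is reproduced to within rounding (one decimal place), which supports reading the garbled summary lines the way they are printed.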



